Lookup NU author(s): Dr Shidong Wang
Full text for this publication is not currently held within this repository. Alternative links are provided below where available.
© The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2025.
Multi-modality image fusion aims to extract complementary features from multiple source images of different modalities, generating a fused image that inherits their advantages. To address challenges in cross-modality shared feature (CMSF) extraction, single-modality specific feature (SMSF) fusion, and the absence of ground truth (GT) images, we propose MTG-Fusion, a multi-text guided model. We leverage the capabilities of large vision-language models to generate text descriptions tailored to the input images, providing novel insights for these challenges. Our model introduces a text-guided CMSF extractor (TGCE) and a text-guided SMSF fusion module (TGSF). TGCE transforms visual features into the text domain using manifold-isometric domain transform techniques and provides effective visual-text interaction based on text-vision and text-text distances. TGSF fuses each dimension of visual features with corresponding text features, creating a weight matrix utilized for SMSF fusion. We also incorporate the constructed textual GT into the loss function for collaborative training. Extensive experiments demonstrate that MTG-Fusion achieves state-of-the-art performance on infrared and visible image fusion and medical image fusion tasks. The code is available at: https://github.com/zhaolb4080/MTG-Fusion.
Author(s): Wang Z, Zhao L, Zhang J, Song R, Song H, Meng J, Wang S
Publication type: Article
Publication status: Published
Journal: International Journal of Computer Vision
Year: 2025
Issue: ePub ahead of Print
Online publication date: 17/03/2025
Acceptance date: 24/02/2025
ISSN (print): 0920-5691
ISSN (electronic): 1573-1405
Publisher: Springer
URL: https://doi.org/10.1007/s11263-025-02409-3
DOI: 10.1007/s11263-025-02409-3