JOURNAL ARTICLE

SpectFusion: Cross-modal Spectrum-aware Attention Network for Unsupervised Multimodal Medical Image Fusion

Lamei Wang, Xinyu Xie, Yun Yang, Dongping Xiong, Hong Zhou, Bin Yang, Kok Lay Teo, Bingo Wing‐Kuen Ling, Xiaozhi Zhang

Year: 2025 | Journal: IEEE Journal of Biomedical and Health Informatics | Vol: PP | Pages: 1-12 | Publisher: Institute of Electrical and Electronics Engineers

Abstract

Medical image fusion aims to synthesize relevant and complementary information from different modalities, thereby enhancing clinical diagnosis. Current deep learning-based fusion approaches, particularly Transformer-based architectures, have achieved remarkable results due to their strong capacity for modeling long-range dependencies. However, their window-based local attention mechanisms still limit the capture of sufficient global information. Moreover, existing fusion schemes predominantly focus on spatial features while rarely considering spectral features, which degrades fusion performance. To address these challenges, we propose a new unsupervised cross-modal spectrum-aware fusion framework, named SpectFusion, for medical image fusion. Specifically, we devise a spatial-spectrum hybrid block, which effectively extracts fine-grained local features via a gradient retention strategy in the spatial domain, and captures global features with an image-wide receptive field through Fourier convolution in the frequency domain. Furthermore, we develop a novel cross-modal spectrum-aware attention mechanism to facilitate spatial-spectrum information interactions during fusion. It dynamically guides the retention of relevant spectral components while integrating multimodal spatial features. Additionally, to achieve more precise alignment of image pairs, we incorporate a refined registration module to correct minor local deviations. We also define corresponding frequency-domain and spatial-domain losses to jointly constrain the proposed SpectFusion. By leveraging spatial-spectrum information interactions, fine-grained fusion can be adaptively realized. Extensive experiments, including clinical brain tumor image fusion, demonstrate that SpectFusion outperforms other state-of-the-art methods both qualitatively and quantitatively. We also show that SpectFusion boosts performance in downstream tasks such as multimodal medical image segmentation.
The code is available at https://github.com/PlumW/SpectFusion.
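The image-wide receptive field claimed for the frequency-domain branch follows from a basic property of Fourier convolution: multiplying the spectrum by a per-frequency filter mixes information from every pixel at once. The following minimal NumPy sketch illustrates that principle only; it is not the authors' implementation (see their repository for that), and the function name and learnable-filter shapes are illustrative assumptions.

```python
import numpy as np

def fourier_conv2d(x, w_real, w_imag):
    """Pointwise spectral filtering: FFT -> complex multiply -> inverse FFT.

    Each frequency bin of X depends on every spatial location of x, so a
    single such layer already has an image-wide receptive field, unlike a
    small spatial kernel or a windowed attention block.
    """
    X = np.fft.rfft2(x)                      # (H, W//2 + 1) complex spectrum
    W = w_real + 1j * w_imag                 # per-frequency filter (learnable in practice)
    return np.fft.irfft2(X * W, s=x.shape)   # back to the spatial domain

# Sanity check: an all-ones real filter is the identity operation.
img = np.random.rand(8, 8)
out = fourier_conv2d(img, np.ones((8, 5)), np.zeros((8, 5)))
assert np.allclose(out, img)
```

In a trained network the real and imaginary filter weights would be learned parameters, letting the block emphasize or suppress individual spectral components, which is the role the abstract assigns to the spectrum-aware attention.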

