JOURNAL ARTICLE

MFFNet: a wavelet transform-based multimodal frequency fusion network for remote sensing semantic segmentation

Chao Li, Haitao Lyu, Weipeng Jing, Ye Yuan, Guangliang Cheng

Year: 2025 Journal:   GIScience & Remote Sensing Vol: 62 (1)   Publisher: Taylor & Francis

Abstract

The use of multimodal data for semantic segmentation in remote sensing has attracted considerable interest, as it enables the integration of complementary information from various sensors. However, conventional multimodal fusion methods primarily operate in the spatial domain. Given the substantial divergence and inherent redundancy across modalities, direct fusion in the spatial domain often leads to the accumulation of irrelevant information and the loss of useful features. Furthermore, spatial-domain fusion alone is insufficient to fully exploit the complementary characteristics of multimodal data. To address these challenges, we introduce a wavelet transform-based multimodal frequency fusion network (MFFNet) to compensate for the limitations of spatial-domain fusion by introducing frequency-domain information. Specifically, we propose the spatial-frequency domain wavelet attention fusion module (SFWAF), which uses weight-shared spatial-domain branches to extract generic spatial features for different modalities. The SFWAF module uses the discrete wavelet transform (DWT) to map different modal features into the frequency domain for fusion and adaptively integrates the dual-domain features using a learnable weighting factor. Additionally, we propose a lightweight frequency-enhanced feature fusion (FEF) module for multiscale feature integration. This module fuses high-frequency components from various modalities using a fixed fusion strategy to preserve critical edge and detail information. Extensive experimental results on the ISPRS Vaihingen, ISPRS Potsdam, and WHU-OPT-SAR datasets demonstrate that MFFNet outperforms traditional multimodal fusion methods, achieving mIoU of 84.21% and 85.88% on the Vaihingen and Potsdam datasets, respectively, and overall accuracies of 92.26% and 91.16%.
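The dual-domain idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It uses a single-level 2D Haar wavelet on plain 2D lists, and every name in it (`haar_dwt2`, `haar_idwt2`, `combine`, `fuse`, `alpha`) is hypothetical. It rests on three assumptions the abstract does not confirm: the low-frequency bands of the two modalities are averaged, the high-frequency bands are fused with a max-absolute-value rule (one common fixed strategy), and a fixed scalar `alpha` stands in for the learnable weighting factor that mixes the spatial-domain and frequency-domain results.

```python
def haar_dwt2(x):
    """Single-level 2D Haar DWT of a 2D list with even height and width.

    Returns four half-resolution bands: the low-frequency approximation
    followed by three high-frequency detail bands.
    """
    h, w = len(x), len(x[0])
    assert h % 2 == 0 and w % 2 == 0, "height and width must be even"
    bands = ([], [], [], [])
    for i in range(0, h, 2):
        rows = ([], [], [], [])
        for j in range(0, w, 2):
            a, b = x[i][j], x[i][j + 1]
            c, d = x[i + 1][j], x[i + 1][j + 1]
            rows[0].append((a + b + c + d) / 2)  # approximation (low frequency)
            rows[1].append((a + b - c - d) / 2)  # top-minus-bottom detail
            rows[2].append((a - b + c - d) / 2)  # left-minus-right detail
            rows[3].append((a - b - c + d) / 2)  # diagonal detail
        for band, row in zip(bands, rows):
            band.append(row)
    return bands


def haar_idwt2(low, d1, d2, d3):
    """Inverse of haar_dwt2: rebuild the full-resolution 2D list."""
    out = []
    for r in range(len(low)):
        top, bottom = [], []
        for k in range(len(low[0])):
            s, u, v, g = low[r][k], d1[r][k], d2[r][k], d3[r][k]
            top += [(s + u + v + g) / 2, (s + u - v - g) / 2]
            bottom += [(s - u + v - g) / 2, (s - u - v + g) / 2]
        out += [top, bottom]
    return out


def combine(p, q, rule):
    """Apply a two-argument rule elementwise to two equally sized 2D lists."""
    return [[rule(a, b) for a, b in zip(rp, rq)] for rp, rq in zip(p, q)]


def fuse(x1, x2, alpha=0.5):
    """Blend a spatial-domain fusion with a wavelet-domain fusion.

    alpha is a fixed stand-in for a learnable weight (assumption).
    """
    b1, b2 = haar_dwt2(x1), haar_dwt2(x2)
    # Assumption: average the low-frequency bands of the two modalities.
    low = combine(b1[0], b2[0], lambda a, b: (a + b) / 2)
    # Assumption: keep the larger-magnitude coefficient in each detail band,
    # so the sharper edge response of either modality survives.
    highs = [
        combine(p, q, lambda a, b: a if abs(a) >= abs(b) else b)
        for p, q in zip(b1[1:], b2[1:])
    ]
    freq = haar_idwt2(low, *highs)
    # Assumption: an elementwise mean stands in for the spatial branch.
    spatial = combine(x1, x2, lambda a, b: (a + b) / 2)
    return combine(spatial, freq, lambda s, f: alpha * s + (1 - alpha) * f)


# Toy inputs standing in for two single-channel modality feature maps.
optical = [[1, 1, 5, 5], [1, 1, 5, 5], [1, 1, 5, 5], [1, 1, 5, 5]]
height = [[2, 2, 2, 2], [2, 2, 2, 2], [8, 8, 8, 8], [8, 8, 8, 8]]
fused = fuse(optical, height, alpha=0.5)
```

In a trained network, `alpha` would be a parameter updated by backpropagation and the inputs would be multi-channel feature maps. The sketch only shows how the two domains feed into one output.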

Keywords:
Computer science, Artificial intelligence, Image fusion, Fusion, Wavelet, Segmentation, Frequency domain, Pattern recognition (psychology), Sensor fusion, Weighting, Wavelet transform, Feature (linguistics), Domain (mathematical analysis), Computer vision, Data mining, Mathematics, Image (mathematics)

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 55
Citation Normalized Percentile: 0.38
Is in top 1%
Is in top 10%

Topics

Remote-Sensing Image Classification
Physical Sciences →  Engineering →  Media Technology
Geophysical Methods and Applications
Physical Sciences →  Engineering →  Ocean Engineering
Underwater Acoustics Research
Physical Sciences →  Earth and Planetary Sciences →  Oceanography

Related Documents

JOURNAL ARTICLE

Learning Frequency-Domain Fusion for Multimodal Remote Sensing Semantic Segmentation

Guangsheng Chen, Fangyu Sun, Weipeng Jing, Weitao Zou, Donglin Di, Yang Song, Lei Fan

Journal:   IEEE Transactions on Geoscience and Remote Sensing Year: 2025 Vol: 63 Pages: 1-16
JOURNAL ARTICLE

Vision Foundation Model Guided Multimodal Fusion Network for Remote Sensing Semantic Segmentation

Pan Chen, Xijian Fan, Tardi Tjahjadi, Haiyan Guan, Liyong Fu, Qiaolin Ye, Ruili Wang

Journal:   IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing Year: 2025 Vol: 18 Pages: 9409-9431