JOURNAL ARTICLE

Mamba-STFM: A Mamba-Based Spatiotemporal Fusion Method for Remote Sensing Images

Qiyuan Zhang, Xiaodan Zhang, Chen Quan, Tong Zhao, Wei Huo, Yuanchen Huang

Year: 2025; Journal: Remote Sensing; Vol: 17 (13); Pages: 2135-2135; Publisher: Multidisciplinary Digital Publishing Institute

Abstract

Spatiotemporal fusion techniques can generate remote sensing imagery with both high spatial and high temporal resolution, thereby facilitating Earth observation. However, traditional methods are constrained by linear assumptions; generative adversarial networks suffer from mode collapse; convolutional neural networks struggle to capture global context; and Transformers are hard to scale because of their quadratic computational complexity and high memory consumption. To address these challenges, this study introduces an end-to-end spatiotemporal fusion approach for remote sensing images based on the Mamba architecture (Mamba-spatiotemporal fusion model, Mamba-STFM). It marks the first application of Mamba in this domain and presents a novel paradigm for spatiotemporal fusion model design. Mamba-STFM consists of a feature extraction encoder and a feature fusion decoder. At the core of the encoder is the visual state space-FuseCore-AttNet block (VSS-FCAN block), which deeply integrates linear-complexity cross-scan global perception with a channel attention mechanism, significantly reducing computation and memory overhead relative to quadratic-complexity approaches while improving inference throughput through parallel scanning and kernel fusion techniques. The core of the decoder is the spatiotemporal mixture-of-experts fusion module (STF-MoE block), composed of our novel spatial expert and temporal expert modules. The spatial expert adaptively adjusts channel weights to optimize spatial feature representation, enabling precise alignment and fusion of multi-resolution images. The temporal expert incorporates a temporal squeeze-and-excitation mechanism and selective state space model (SSM) techniques to efficiently capture short-range temporal dependencies, maintain linear sequence modeling complexity, and further enhance overall spatiotemporal fusion throughput. Extensive experiments on public datasets demonstrate that Mamba-STFM outperforms existing methods in fusion quality; ablation studies validate the effectiveness of each core module; and efficiency analyses and application comparisons further confirm the model's superior performance.
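
The linear sequence modeling complexity that the abstract attributes to selective state space models can be illustrated with a toy scan. The sketch below is not the authors' implementation: the function name selective_scan, the scalar state, the decay_rate parameter, and the softplus step size are all assumptions chosen for illustration. It only shows that an input-dependent recurrence visits each element once, so the cost grows linearly with sequence length instead of quadratically as in pairwise attention.

```python
import math


def selective_scan(xs, decay_rate=1.0):
    """Toy scalar selective state-space scan (hypothetical illustration).

    Continuous model: h'(t) = -decay_rate * h(t) + decay_rate * x(t), y = h.
    It is discretized with a zero-order hold whose step size depends on the
    input, which is the "selective" ingredient in this sketch.
    """
    h = 0.0
    ys = []
    for x in xs:  # one pass over the sequence: O(len(xs)) time
        # Numerically stable softplus gives a positive, input-dependent step.
        delta = max(x, 0.0) + math.log1p(math.exp(-abs(x)))
        a_bar = math.exp(-decay_rate * delta)  # state retention, in (0, 1)
        b_bar = 1.0 - a_bar                    # matching input gain
        h = a_bar * h + b_bar * x              # state update
        ys.append(h)                           # output equals the state here
    return ys


if __name__ == "__main__":
    # A constant input drives the state monotonically toward that input.
    print(selective_scan([1.0] * 5))
```

A full model would use vector-valued states, learned projections for the step size and input/output matrices, and a parallel scan in place of the Python loop; none of those details are given in the abstract, so they are left out here.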

Keywords:
Remote sensing, Environmental science, Geography

Metrics

Cited By: 2
FWCI (Field Weighted Citation Impact): 7.03
Refs: 56
Citation Normalized Percentile: 0.92

Topics

Advanced Image Fusion Techniques
Physical Sciences → Engineering → Media Technology
Remote-Sensing Image Classification
Physical Sciences → Engineering → Media Technology
Image and Signal Denoising Methods
Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Related Documents

JOURNAL ARTICLE

EST-STFM: An Efficient Deep-Learning-Based Spatiotemporal Fusion Method for Remote Sensing Images

Qiyuan Zhang, Xiaodan Zhang, Chen Quan, Tong Zhao, Wei Huo, Yuanchen Huang

Journal: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing; Year: 2025; Vol: 18; Pages: 18633-18655

JOURNAL ARTICLE

Algae-Mamba: A Spatially Variable Mamba for Algae Extraction From Remote Sensing Images

Y. Zhang, Shuaipeng Wang, Yanlong Chen, Shiqing Wei, Mingming Xu, Shanwei Liu

Journal: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing; Year: 2025; Vol: 18; Pages: 14324-14337

JOURNAL ARTICLE

HLMamba: Hybrid Lightweight Mamba-Based Fusion Network for Dense Prediction of Remote Sensing Images

Wujie Zhou, Penghan Yang, Yuanyuan Liu

Journal: IEEE Transactions on Geoscience and Remote Sensing; Year: 2025; Vol: 63; Pages: 1-11