Qiyuan Zhang, Xiaodan Zhang, Chen Quan, Tong Zhao, Wei Huo, Yuanchen Huang
Spatiotemporal fusion techniques can generate remote sensing imagery with high spatial and temporal resolution, thereby facilitating Earth observation. However, traditional methods are constrained by linear assumptions; generative adversarial networks suffer from mode collapse; convolutional neural networks struggle to capture global context; and Transformers are hard to scale because of their quadratic computational complexity and high memory consumption. To address these challenges, this study introduces an end-to-end remote sensing image spatiotemporal fusion approach based on the Mamba architecture (the Mamba spatiotemporal fusion model, Mamba-STFM), marking the first application of Mamba in this domain and presenting a novel paradigm for spatiotemporal fusion model design. Mamba-STFM consists of a feature extraction encoder and a feature fusion decoder. At the core of the encoder is the visual state space-FuseCore-AttNet block (VSS-FCAN block), which deeply integrates linear-complexity cross-scan global perception with a channel attention mechanism, significantly reducing quadratic-level computation and memory overhead while improving inference throughput through parallel scanning and kernel fusion techniques. The core of the decoder is the spatiotemporal mixture-of-experts fusion module (STF-MoE block), composed of our novel spatial expert and temporal expert modules. The spatial expert adaptively adjusts channel weights to optimize spatial feature representation, enabling precise alignment and fusion of multi-resolution images, while the temporal expert incorporates a temporal squeeze-and-excitation mechanism and selective state space model (SSM) techniques to efficiently capture short-range temporal dependencies, maintain linear sequence-modeling complexity, and further enhance overall spatiotemporal fusion throughput.
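As a minimal illustration of the channel-attention idea underlying the squeeze-and-excitation mechanism mentioned above, the sketch below recalibrates a feature map by gating each channel with a learned weight in (0, 1). This is a generic squeeze-and-excitation example in NumPy, not the paper's actual VSS-FCAN or STF-MoE implementation; the function name and weight shapes are illustrative assumptions.

```python
import numpy as np

def squeeze_excite(x, w1, w2):
    """Generic channel attention (squeeze-and-excitation) over a feature map.

    x:  (C, H, W) feature map
    w1: (C//r, C) reduction weights (bottleneck ratio r)
    w2: (C, C//r) expansion weights
    Returns a recalibrated feature map with the same shape as x.
    """
    # Squeeze: global average pooling per channel -> vector of shape (C,)
    z = x.mean(axis=(1, 2))
    # Excite: bottleneck MLP, ReLU then sigmoid, yields per-channel gates in (0, 1)
    h = np.maximum(w1 @ z, 0.0)
    s = 1.0 / (1.0 + np.exp(-(w2 @ h)))
    # Scale: reweight each channel of the input by its gate
    return x * s[:, None, None]

rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 4, 2
x = rng.normal(size=(C, H, W))
w1 = rng.normal(size=(C // r, C))
w2 = rng.normal(size=(C, C // r))
y = squeeze_excite(x, w1, w2)
print(y.shape)  # (8, 4, 4)
```

Because each channel is multiplied by a sigmoid gate, the output never exceeds the input in magnitude; the network learns which channels to emphasize or suppress.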
Extensive experiments on public datasets demonstrate that Mamba-STFM outperforms existing methods in fusion quality; ablation studies validate the effectiveness of each core module; and efficiency analyses and application comparisons further confirm the model’s superior performance.