JOURNAL ARTICLE

Cross‐Transformer Fusion Network for Multimodal Remote Sensing Image Classification

Huiqing Wang, Zhongyu Li, Linfeng Wu

Year: 2025
Journal: The Photogrammetric Record
Vol: 40 (191)
Publisher: Wiley

Abstract

In Earth observation, multimodal remote sensing (RS) image fusion has attracted considerable research interest. Although deep learning networks have made great progress in multimodal RS image classification, challenges remain in multimodal feature fusion strategies, the sequence of spectral features, and the location of spatial features. This paper therefore presents a novel approach for classifying multimodal RS data based on cross‐transformer fusion (CTF). First, independent component analysis (ICA) was used to reduce the dimensionality of the spectral features, and dual‐branch 3D and 2D convolutional neural networks (CNNs) were used to extract spectral‐spatial and height‐related features across the modalities. Then, to fuse the features extracted from the two modalities, a cross‐transformer feature fusion strategy was designed that exploits the transformer's strength in modeling long‐range dependencies and in processing spectral feature sequences. By combining the strong capability of CNNs in extracting spatial context information with the CTF‐based transformer architecture, the recognition, extraction, and fusion of multimodal feature information can be effectively improved. To validate the proposed approach, three benchmark multimodal RS datasets were selected for evaluation. The experimental results demonstrate that the method outperforms existing state‐of‐the‐art techniques in terms of classification accuracy.
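The abstract does not give the internals of the CTF module, so the following is only a minimal sketch of the general idea of cross-attention fusion between two token streams, not the authors' architecture. It assumes single-head attention, random (untrained) projection weights, a residual connection, and mean pooling followed by concatenation; the names `spectral_tokens`, `height_tokens`, `cross_attention`, and `cross_fuse` are hypothetical and stand in for the outputs of the two CNN branches and the fusion step.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, context, w_q, w_k, w_v):
    """Single-head attention: tokens in `queries` attend to tokens in `context`."""
    q = queries @ w_q
    k = context @ w_k
    v = context @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])  # scaled dot-product scores
    weights = softmax(scores, axis=-1)       # each row sums to 1
    return weights @ v, weights

def cross_fuse(spectral_tokens, height_tokens, params):
    """Assumed fusion: each stream queries the other, then pool and concatenate."""
    spec_att, _ = cross_attention(spectral_tokens, height_tokens, *params["spec_from_height"])
    height_att, _ = cross_attention(height_tokens, spectral_tokens, *params["height_from_spec"])
    # Residual connection, then mean-pool each stream into one vector.
    spec_vec = (spectral_tokens + spec_att).mean(axis=0)
    height_vec = (height_tokens + height_att).mean(axis=0)
    return np.concatenate([spec_vec, height_vec])

rng = np.random.default_rng(0)
d = 16                                       # token embedding size (assumption)
spectral_tokens = rng.normal(size=(8, d))    # stand-in for 3D-CNN branch output
height_tokens = rng.normal(size=(4, d))      # stand-in for 2D-CNN branch output
params = {
    name: tuple(rng.normal(scale=d ** -0.5, size=(d, d)) for _ in range(3))
    for name in ("spec_from_height", "height_from_spec")
}
fused = cross_fuse(spectral_tokens, height_tokens, params)
print(fused.shape)  # (32,)
```

In a trained model the projection matrices would be learned and the fused vector would feed a classification head; here they are random so that the sketch stays self-contained.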

Keywords:
Artificial intelligence, Computer science, Image fusion, Fusion, Transformer, Pattern recognition (psychology), Computer vision, Image (mathematics), Engineering, Electrical engineering

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 41
Citation Normalized Percentile: 0.37

Topics

Remote-Sensing Image Classification
Physical Sciences →  Engineering →  Media Technology
Advanced Image Fusion Techniques
Physical Sciences →  Engineering →  Media Technology
Remote Sensing and Land Use
Physical Sciences →  Earth and Planetary Sciences →  Atmospheric Science