Cross‐Transformer Fusion Network for Multimodal Remote Sensing Image Classification

Huiqing Wang; Zhongyu Li; Linfeng Wu

doi:10.1111/phor.70014

JOURNAL ARTICLE

Year: 2025 Journal: The Photogrammetric Record Vol: 40 (191) Publisher: Wiley

Abstract

In earth observation, multimodal remote sensing (RS) image fusion has attracted considerable research interest. Although deep learning networks have made great progress in multimodal RS image classification, challenges remain in multimodal feature fusion strategies, the sequence of spectral features, and the location of spatial features. This paper therefore presents a novel approach for classifying multimodal RS data based on cross‐transformer fusion (CTF). First, Independent Component Analysis (ICA) is used to reduce the dimensionality of the spectral features, and dual‐branch 3D and 2D convolutional neural networks (CNNs) extract the spectral‐spatial and height‐related features across the modalities. Then, to fuse the features extracted from the two modalities, a cross‐transformer feature fusion strategy is designed that exploits the transformer's ability to model long‐range dependencies and its suitability for processing spectral feature sequences. Combining the strength of CNNs in extracting spatial context with the CTF‐based transformer architecture improves the recognition, extraction, and fusion of multimodal feature information. To validate the approach, three benchmark multimodal RS datasets were selected for evaluation. The experimental results show that the method outperforms existing state‐of‐the‐art techniques in classification accuracy.
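
The fusion step described in the abstract can be illustrated with a minimal NumPy sketch of bidirectional cross-attention between two token sequences. This is not the authors' implementation: the token counts, the feature width, the random projection weights, the residual-plus-mean-pooling readout, and all function names (`cross_attention`, `cross_fuse`) are assumptions made for illustration, and the ICA and CNN stages are replaced by random stand-in features.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(query_tokens, context_tokens, w_q, w_k, w_v):
    # Queries come from one modality; keys and values come from the other.
    q = query_tokens @ w_q        # (n_q, d)
    k = context_tokens @ w_k      # (n_ctx, d)
    v = context_tokens @ w_v      # (n_ctx, d)
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)   # each row sums to 1
    return weights @ v, weights

def cross_fuse(spectral_tokens, height_tokens, params):
    # Each branch attends to the other branch (hypothetical fusion layout).
    spec_ctx, _ = cross_attention(spectral_tokens, height_tokens,
                                  *params["spectral_from_height"])
    hgt_ctx, _ = cross_attention(height_tokens, spectral_tokens,
                                 *params["height_from_spectral"])
    # Residual connection, then mean-pool each branch and concatenate.
    spec_out = (spectral_tokens + spec_ctx).mean(axis=0)
    hgt_out = (height_tokens + hgt_ctx).mean(axis=0)
    return np.concatenate([spec_out, hgt_out])

# Stand-in features: in the paper these would come from the CNN branches.
rng = np.random.default_rng(0)
d = 16
spectral_tokens = rng.normal(size=(8, d))   # assumed: 8 spectral-spatial tokens
height_tokens = rng.normal(size=(4, d))     # assumed: 4 height-related tokens
params = {
    name: tuple(rng.normal(scale=d ** -0.5, size=(d, d)) for _ in range(3))
    for name in ("spectral_from_height", "height_from_spectral")
}
fused = cross_fuse(spectral_tokens, height_tokens, params)
_, attn = cross_attention(spectral_tokens, height_tokens,
                          *params["spectral_from_height"])
```

In this sketch `fused` is a single vector of length `2 * d` that a classifier head could consume. A trained model would learn the projection matrices and would typically add multi-head attention, layer normalization, and feed-forward sublayers.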

Keywords:

Artificial intelligence; Computer science; Image fusion; Fusion; Transformer; Pattern recognition (psychology); Computer vision; Image (mathematics); Engineering; Electrical engineering

Metrics

Cited By

FWCI (Field Weighted Citation Impact): 0.00

Refs

Citation Normalized Percentile: 0.37

Is in top 1%

Is in top 10%

Topics

Remote-Sensing Image Classification

Physical Sciences → Engineering → Media Technology

Advanced Image Fusion Techniques

Physical Sciences → Engineering → Media Technology

Remote Sensing and Land Use

Physical Sciences → Earth and Planetary Sciences → Atmospheric Science

Related Documents

Multimodal Fusion Transformer for Remote Sensing Image Classification

A multimodal hyper-fusion transformer for remote sensing image classification

Multipath fusion transformer network for hyperspectral remote sensing image classification

Cross-layer fusion enhanced transformer network for remote sensing scene classification

Cross Attention Fusion Transformer Network for Urban Remote Sensing Image Segmentation