JOURNAL ARTICLE

Hybrid Attention Fusion Embedded in Transformer for Remote Sensing Image Semantic Segmentation

Yan Chen, Quan Dong, Xiaofeng Wang, Qianchuan Zhang, Menglei Kang, Wenxiang Jiang, Mengyuan Wang, Lixiang Xu, Chen Zhang

Year: 2024, Journal: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, Vol: 17, Pages: 4421-4435, Publisher: Institute of Electrical and Electronics Engineers

Abstract

With the rapid progress of deep learning, convolutional neural networks have been widely applied to the semantic segmentation of remote sensing images and have achieved strong results. However, because convolution operates on local neighborhoods, these networks are limited in their ability to capture global contextual information. Recently, the Transformer has become a focus of research in computer vision and has shown great potential for extracting global contextual information, further advancing semantic segmentation. In this article, we use ResNet50 as the encoder, embed a hybrid attention mechanism into the Transformer, and propose a Transformer-based decoder. The Channel-Spatial Transformer Block aggregates features by integrating the local feature maps extracted by the encoder with their associated global dependencies. At the same time, an adaptive approach reweights the interdependent channel maps to enhance feature fusion. The global cross-fusion module combines the extracted complementary features to obtain more comprehensive semantic information. Extensive comparative experiments were conducted on the ISPRS Potsdam and Vaihingen datasets, where mIoU reached 78.06% and 76.37%, respectively. Multiple ablation experiments also validate the effectiveness of the proposed method.
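The abstract reports results as mIoU (mean intersection over union). As a rough illustration of how that metric is commonly computed from a confusion matrix, here is a minimal NumPy sketch. It is not the authors' evaluation code; the function names `confusion_matrix` and `mean_iou` and the choice to skip classes absent from both prediction and ground truth are assumptions made for this example.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes):
    """Build a num_classes x num_classes matrix (rows: ground truth, cols: prediction)."""
    y_true = np.asarray(y_true).ravel().astype(np.int64)
    y_pred = np.asarray(y_pred).ravel().astype(np.int64)
    # Encode each (truth, prediction) pair as a single index, then count occurrences.
    idx = y_true * num_classes + y_pred
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)

def mean_iou(cm):
    """Average per-class IoU = TP / (TP + FP + FN) over classes that occur."""
    tp = np.diag(cm).astype(float)
    # Column sum = all pixels predicted as the class; row sum = all true pixels of the class.
    union = cm.sum(axis=0) + cm.sum(axis=1) - tp
    present = union > 0  # assumption: ignore classes absent from both maps
    return float((tp[present] / union[present]).mean())

# Toy example with two classes and four pixels.
cm = confusion_matrix([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
score = mean_iou(cm)
```

In the toy example, class 0 has an IoU of 1/2 and class 1 has an IoU of 2/3, so `score` is their average, 7/12. In practice the confusion matrix would be accumulated over every tile of the test set before the per-class ratios are taken.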

Keywords:
Computer science, Encoder, Segmentation, Artificial intelligence, Convolutional neural network, Transformer, Feature learning, Deep learning, Feature extraction, Pattern recognition (psychology), Image segmentation, Computer vision, Engineering

Metrics

Cited By: 23
FWCI (Field Weighted Citation Impact): 12.19
Refs: 72
Citation Normalized Percentile: 0.98

Topics

Advanced Neural Network Applications
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Advanced Image and Video Retrieval Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Remote-Sensing Image Classification
Physical Sciences →  Engineering →  Media Technology

Related Documents

JOURNAL ARTICLE

HMAFNet: Hybrid Mamba-Attention Fusion Network for Remote Sensing Image Semantic Segmentation

Haoyue Sun, Jianjun Liu, Jinlong Yang, Zebin Wu

Journal: IEEE Geoscience and Remote Sensing Letters, Year: 2025, Vol: 22, Pages: 1-5

JOURNAL ARTICLE

CNN and Transformer Fusion for Remote Sensing Image Semantic Segmentation

Xin Chen, Dongfen Li, Mingzhe Liu, Jiaru Jia

Journal: Remote Sensing, Year: 2023, Vol: 15 (18), Pages: 4455-4455

JOURNAL ARTICLE

CTFNet: CNN-Transformer Fusion Network for Remote-Sensing Image Semantic Segmentation

Honglin Wu, Peng Huang, Min Zhang, Wenlong Tang

Journal: IEEE Geoscience and Remote Sensing Letters, Year: 2023, Vol: 21, Pages: 1-5

JOURNAL ARTICLE

Hybrid Attention Driven CNN-Mamba Multimodal Fusion Network for Remote Sensing Image Semantic Segmentation

Shu Tian, Minglei Li, Lin Cao, Lihong Kang, Jing Tian, Xiangwei Xing, Bo Shen, Kangning Du, Chong Fu, Ye Zhang

Journal: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, Year: 2025, Vol: 19, Pages: 2254-2272