Qiaolin Zeng, Jingxiang Zhou, Jinhua Tao, Liangfu Chen, Xuerui Niu, Yumeng Zhang
Semantic segmentation of high-resolution remote sensing images (HRSIs) is a challenging task because objects in HRSIs usually exhibit large variance in scale and appearance. Although deep convolutional neural networks (DCNNs) have been widely applied to the semantic segmentation of HRSIs, they have inherent limitations in capturing global context. Attention mechanisms and transformers can effectively model long-range dependencies, but they often incur high computational costs when applied to HRSIs. In this article, an encoder-decoder network (MSGCNet) is proposed to fully and efficiently model the multiscale context and long-range dependencies of HRSIs. Specifically, the multiscale interaction (MSI) module employs an efficient cross-attention to facilitate interaction among the multiscale features of the encoder, which bridges the semantic gap between high- and low-level features and introduces more scale information into the network. To efficiently model long-range dependencies in both the spatial and channel dimensions, the transformer-based decoder block (TBDB) implements window-based efficient multihead self-attention (W-EMSA) and enables interactions across windows. Furthermore, to further integrate the global context generated by the TBDB, the scale-aware fusion (SAF) module is proposed to deeply supervise the decoder, iteratively fusing hierarchical features through spatial attention. As demonstrated by both quantitative and qualitative experimental results on two publicly available datasets, the proposed MSGCNet outperforms currently popular methods. The code will be available at http://github.com/JingxiangZhou/MSGCNet.
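To make the window-based self-attention idea concrete, here is a minimal NumPy sketch of attention restricted to non-overlapping windows; this is an illustrative assumption, not the paper's W-EMSA implementation (in particular, the Q/K/V projections are taken as identity for brevity, and the cross-window interaction described in the abstract is omitted). The key point it shows is that attention cost scales with the window size rather than with the full H×W token count.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def window_self_attention(x, window=4, num_heads=2):
    """Multi-head self-attention computed independently inside each
    non-overlapping window x window patch of a (H, W, C) feature map.
    Complexity is O(num_windows * window^4) instead of O((H*W)^2).
    Identity Q/K/V projections are assumed purely for illustration."""
    H, W, C = x.shape
    assert H % window == 0 and W % window == 0 and C % num_heads == 0
    d = C // num_heads  # per-head channel dimension
    out = np.zeros_like(x)
    for i in range(0, H, window):
        for j in range(0, W, window):
            # Flatten one window into (window^2, C) tokens.
            tokens = x[i:i + window, j:j + window].reshape(-1, C)
            merged = np.zeros_like(tokens)
            for h in range(num_heads):
                q = k = v = tokens[:, h * d:(h + 1) * d]
                attn = softmax(q @ k.T / np.sqrt(d))  # (window^2, window^2)
                merged[:, h * d:(h + 1) * d] = attn @ v
            out[i:i + window, j:j + window] = merged.reshape(window, window, C)
    return out
```

Because each token attends only to the tokens in its own window, doubling the image size doubles the number of windows but leaves the per-window attention matrix fixed, which is what makes window-based attention tractable on HRSIs.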