Remote sensing scene (RSS) classification is an important research topic for high-resolution (HR) remote sensing image understanding. Recently, many approaches, including data-driven and machine learning methods, have been presented for this task. However, accurately identifying scenes from HR remote sensing images remains challenging because it is difficult to effectively extract multiscale and key features from the complex geometrical structures and spatial patterns of large-scale ground objects. In this paper, we propose a novel local and global semantic relationship network (LGSRNet) for RSS classification. ConvNeXt-T, which matches the performance of the local-window Swin Transformer, is adopted to extract feature maps with powerful discriminative ability. Meanwhile, a semantic relation learning (SRL) module built on graph convolutional networks is presented to further learn the semantic relationships between RSS category labels within the spatial domain. Subsequently, cosine similarity is adopted to fuse the ConvNeXt-T and SRL outputs. Extensive experiments on two RSS classification datasets (AID and NWPU-RESISC45) demonstrate that LGSRNet outperforms several state-of-the-art methods.
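The fusion step described above can be illustrated with a minimal NumPy sketch: per-class label embeddings are refined by one graph-convolution step over a label relation graph, then scored against a backbone image feature by cosine similarity. All function names, shapes, and the toy data here are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def gcn_layer(adj, x, w):
    """One toy graph-convolution step: row-normalized adjacency @ features @ weights, then ReLU."""
    deg = adj.sum(axis=1, keepdims=True)
    return np.maximum((adj / deg) @ x @ w, 0.0)

def cosine_scores(img_feat, label_emb):
    """Cosine similarity between one image feature vector and each class embedding."""
    img = img_feat / (np.linalg.norm(img_feat) + 1e-9)
    lab = label_emb / (np.linalg.norm(label_emb, axis=1, keepdims=True) + 1e-9)
    return lab @ img

rng = np.random.default_rng(0)
num_classes, dim = 4, 8

# Toy label relation graph with self-loops (a stand-in for learned label relationships).
adj = np.eye(num_classes) + (rng.random((num_classes, num_classes)) > 0.5)

label_emb = gcn_layer(adj,
                      rng.normal(size=(num_classes, dim)),   # initial label embeddings
                      rng.normal(size=(dim, dim)))           # GCN weight matrix
img_feat = rng.normal(size=dim)                              # stand-in for a backbone feature

scores = cosine_scores(img_feat, label_emb)   # one similarity score per class
pred = int(scores.argmax())                   # predicted scene category index
```

In practice the image feature would come from the CNN backbone (ConvNeXt-T in the paper) and the class with the highest similarity would be taken as the scene label.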
Junjie Wang, Wei Li, Mengmeng Zhang, Yunhao Gao, Boyu Zhao
Junge Shen, Tianwei Yu, Haopeng Yang, Ruxin Wang, Qi Wang
Jingjing Ma, Qiushuo Ma, Xu Tang, Xiangrong Zhang, Cheng Zhu, Qunnie Peng, Licheng Jiao
Hua Zhang, Yindi Zhao, Weilin Wang, Jun Zhu, Yu Cao
Xiumei Chen, Xiangtao Zheng, Yue Zhang, Xiaoqiang Lu