Convolutional neural network (CNN)-based methods have been extensively used for remote sensing scene classification (RSSC) and have obtained remarkable classification results. However, their limited ability to extract global features has hindered further improvement. Transformers can directly capture global features through self-attention mechanisms, but they are deficient in modeling local features. Directly combining CNN and Transformer features may lead to feature imbalance and introduce redundant information. To address these problems, we propose a local and global feature adaptive adjustment network (LGFAANet) for RSSC. First, we employ a dual-branch network structure to extract local and global features from remote sensing scene images. Second, we design a local and global feature adaptive adjustment module (LGFAA) to dynamically allocate weights to the features. Third, we use a multi-layer feature aggregation module (MLFA) to aggregate the adjusted features, thereby further enhancing feature representation. Finally, we introduce a joint loss to accelerate network convergence while reducing intra-class distance and increasing inter-class distance. Experimental results demonstrate that the proposed method achieves enhanced feature representation ability and outperforms existing state-of-the-art methods.
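The idea of dynamically allocating weights to local and global features can be illustrated with a minimal sketch. This is not the LGFAA module itself, whose design the abstract does not specify. It assumes each branch yields a feature vector of equal length, and it uses the mean activation of each vector as a stand-in for the learned scoring layers a real network would use. The function names `softmax` and `adaptive_fuse` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def adaptive_fuse(local_feat, global_feat):
    """Fuse two equal-length feature vectors with input-dependent weights.

    Assumption: the per-branch score is simply the mean activation.
    A trained model would compute these scores with learned parameters.
    """
    if len(local_feat) != len(global_feat):
        raise ValueError("feature vectors must have the same length")
    scores = [
        sum(local_feat) / len(local_feat),
        sum(global_feat) / len(global_feat),
    ]
    # The two weights are positive and sum to 1.
    w_local, w_global = softmax(scores)
    fused = [w_local * l + w_global * g for l, g in zip(local_feat, global_feat)]
    return fused, (w_local, w_global)


if __name__ == "__main__":
    fused, weights = adaptive_fuse([0.2, 0.4, 0.6], [1.0, 0.8, 0.9])
    print(weights, fused)
```

Because the weights are recomputed for each input, the branch with the stronger response contributes more to the fused vector, which is one simple way to avoid a fixed, possibly imbalanced combination of the two feature types.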
Junjie Wang, Wei Li, Mengmeng Zhang, Yunhao Gao, Boyu Zhao
Guangrui Lv, Lili Dong, Wenwen Zhang, Wenhai Xu
Yafei Lv, Xiaohan Zhang, Wei Xiong, Yaqi Cui, Mi Cai
Fei Song, Ruofei Ma, Tao Lei, Zhenming Peng