Acquiring effective information from multispectral remote sensing images is a challenging task in semantic segmentation. In this paper, an end-to-end semantic segmentation network (BLASeNet) is proposed. The model adopts an encoder-decoder structure. In the encoder, to exploit the band correlation of multispectral remote sensing images, we propose an effective 3D residual block that encodes the spectral-spatial features of the images. To extract more discriminative features from multispectral images, a band-location adaptive selection mechanism is proposed that adaptively learns the weights of different bands and of different spatial locations within a single band, enhancing feature expression. In the decoder, we introduce two trainable parameter matrices $\mathrm{W}_{\alpha}$ and $\mathrm{W}_{\beta}$ in the skip connections, so that the network adaptively learns the fusion ratio of low-level detail features and high-level semantic features, improving segmentation accuracy. In addition, we extend channel attention to 3D data, further refining the fused feature maps. Experimental results on the ISPRS Potsdam and Qinghai datasets demonstrate the effectiveness of BLASeNet.
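The band-location adaptive selection mechanism can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so the design below is an assumption: one learnable weight per spectral band plus one learnable weight per spatial location within each band, squashed to (0, 1) with a sigmoid and applied multiplicatively to the feature cube. All class and parameter names are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class BandLocationSelection:
    """Hypothetical sketch of band-location adaptive selection:
    a learnable logit per band and a learnable logit per spatial
    location within each band produce multiplicative attention
    weights for a multispectral feature cube. The paper's actual
    design may differ."""
    def __init__(self, bands, height, width):
        # Trainable parameters (updated by backprop in a real network).
        # Zero-initialized, so sigmoid gives a neutral-ish 0.5 weight.
        self.band_logits = np.zeros(bands)                    # one per band
        self.loc_logits = np.zeros((bands, height, width))    # one per location

    def forward(self, x):
        # x: (bands, height, width) multispectral feature cube
        band_w = sigmoid(self.band_logits)[:, None, None]  # broadcast over H, W
        loc_w = sigmoid(self.loc_logits)
        return x * band_w * loc_w

sel = BandLocationSelection(bands=4, height=2, width=2)
x = np.ones((4, 2, 2))
y = sel.forward(x)
print(y[0, 0, 0])  # 0.25 with zero-initialized logits (0.5 * 0.5)
```

During training, bands and locations that carry discriminative information would receive larger logits, so their features are amplified relative to the rest.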
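The decoder's learnable skip-connection fusion admits a similarly minimal sketch. Here $\mathrm{W}_{\alpha}$ and $\mathrm{W}_{\beta}$ are simplified to two trainable scalars normalized with a softmax so the fusion ratios stay positive and sum to one; the paper describes parameter matrices, so this scalar form and all names are assumptions for illustration only.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

class AdaptiveSkipFusion:
    """Hypothetical sketch of a skip connection whose fusion ratio is
    learned: w_alpha weights low-level detail features and w_beta
    weights high-level semantic features. Scalars stand in for the
    paper's parameter matrices."""
    def __init__(self):
        # Trainable logits, updated by backprop in a real network.
        self.w_alpha = 0.0
        self.w_beta = 0.0

    def forward(self, low_level, high_level):
        # Normalize so the two fusion weights sum to 1.
        alpha, beta = softmax(np.array([self.w_alpha, self.w_beta]))
        return alpha * low_level + beta * high_level

fusion = AdaptiveSkipFusion()
low = np.ones((4, 4))    # stand-in low-level (detail) feature map
high = np.zeros((4, 4))  # stand-in high-level (semantic) feature map
fused = fusion.forward(low, high)
print(fused[0, 0])  # 0.5 with equal initial weights
```

Because the ratio is learned rather than fixed, the network can shift toward detail features where boundaries matter and toward semantic features elsewhere.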
Chongxin Tao, Yizhuo Meng, Junjie Li, Beibei Yang, Fengmin Hu, Yuanxi Li, Changlu Cui, Wen Zhang
Qi Zhao, Jiahui Liu, Yuewen Li, Hong Zhang
Kunping Yang, Xinyi Tong, Gui-Song Xia, Weiming Shen, Liangpei Zhang
Haimeng Zhao, Raihani Mohamed, Seng-Beng Ng