With the continuous development of computer vision, semantic segmentation of remote sensing images has become a research hotspot in the field of remote sensing. In this paper, we propose an encoder-decoder network that improves DeepLabV3+ with ResNet backbones for semantic segmentation of ultra-high-resolution remote sensing images. We use ResNet18, ResNet101, and ResNet152 as the backbone of the network, respectively, and combine atrous spatial pyramid pooling (ASPP) to extract semantic information from the remote sensing images at multiple scales; the feature maps are then restored to the input size by bilinear interpolation, realizing "end-to-end" pixel-level classification and precise localization. We conducted experiments on the ultra-high-resolution remote sensing dataset ISPRS Vaihingen to verify the effectiveness of the proposed network. The experimental results show that the improved DeepLabV3+ network outperforms the original DeepLabV3+ network by 5.05%, 9.24%, 9.05%, and 7.62% in PA, MPA, MIoU, and FWIoU, respectively.
Yuqi Zeng, Wenzao Shi, Jiewei Wu, Yuchen Zheng
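The final decoding step described in the abstract, restoring coarse feature maps to the input resolution by bilinear interpolation, can be sketched in NumPy as follows. This is a minimal illustrative sketch, not code from the paper; the function name `bilinear_upsample` and the half-pixel-center sampling convention (as in `align_corners=False`) are assumptions.

```python
import numpy as np

def bilinear_upsample(fm, out_h, out_w):
    """Upsample an (H, W, C) feature map to (out_h, out_w, C) with
    bilinear interpolation using half-pixel sample centers
    (illustrative sketch; mirrors align_corners=False behavior)."""
    h, w, c = fm.shape
    # Map each output pixel center back to source coordinates, clamped.
    ys = np.clip((np.arange(out_h) + 0.5) * h / out_h - 0.5, 0, h - 1)
    xs = np.clip((np.arange(out_w) + 0.5) * w / out_w - 0.5, 0, w - 1)
    # Integer neighbors and fractional weights along each axis.
    y0 = np.floor(ys).astype(int)
    y1 = np.minimum(y0 + 1, h - 1)
    x0 = np.floor(xs).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    wy = (ys - y0)[:, None, None]
    wx = (xs - x0)[None, :, None]
    # Interpolate horizontally on the two neighboring rows, then vertically.
    top = fm[y0][:, x0] * (1 - wx) + fm[y0][:, x1] * wx
    bot = fm[y1][:, x0] * (1 - wx) + fm[y1][:, x1] * wx
    return top * (1 - wy) + bot * wy
```

In a segmentation decoder this operation is applied to the per-class logit maps, after which an argmax over the channel axis yields the pixel-level class labels at the original image resolution.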