This paper describes a deep learning approach to semantic segmentation of very high resolution remote sensing images. We introduce RLFCN, a fully convolutional architecture based on residual logic blocks, to model the ambiguous mapping between remote sensing images and classification maps. In order to recover the output resolution to the original size, we adopt a special way to efficiently learn feature map up-sampling within the network. For optimization, we employ the equally-weighted focal loss which is particularly suitable for the task for it reduces the impact of class imbalance. Our framework consists of only one single architecture which is trained end-to-end and doesn't rely on any post-processing techniques and needs no extra data except images. Based on our framework, we conducted experiments on a ISPRS dataset: Vaihingen. The results indicate that our framework achieves better performance than the current state of the art, while containing fewer parameters and requires fewer training data.
Guanzhou ChenXiaodong ZhangQing WangFan DaiYuanfu GongKun Zhu
Hua ZhangZhengang JiangJun XuXuekun Yao
Xuejun GuoZehua ChenChengyi Wang
Zhihuan WuYongming GaoLei LiJunshi XueYuntao Li