Semantic segmentation, one of the key problems in computer vision, has been applied in various domains such as autonomous driving, robot navigation, and medical imagery, to name a few. Recently, deep learning, especially deep neural networks, has shown significant performance improvement over conventional semantic segmentation methods. In this paper, we present a novel encoder-decoder type deep neural network-based method, namely XSeNet, that can be trained end-to-end in a supervised manner. We adopt ResNet-50 layers as the encoder and design a cascaded decoder composed of a stack of X-Modules, which enables the network to learn dense contextual information and have a wider field-of-view. We evaluate our method on the CamVid dataset, and experimental results reveal that our method can segment most parts of the scene accurately and even outperforms previous state-of-the-art methods.
Changki Sung, Wan-hee Kim, Jungho An, Wooju Lee, Hyungtae Lim, Hyun Myung