Modern real-time segmentation methods employ a two-branch framework to achieve a good trade-off between speed and accuracy. However, we observe that low-level features from the shallow layers undergo less processing, which produces a potential semantic gap between features at different levels. Meanwhile, rigid fusion is less effective because it does not take the characteristics of the two-branch framework into account. In this paper, we propose two novel modules, the Unified Interplay Module and the Separate Pyramid Pooling Module, to address these two issues respectively. Based on the proposed modules, we present a novel Dual Stream Segmentation Network (DSSNet), a two-branch framework for real-time semantic segmentation. Compared with BiSeNet, our DSSNet based on ResNet18 achieves better performance, 76.45% mIoU on the Cityscapes test set, at a similar computation cost. Furthermore, our DSSNet with a ResNet34 backbone outperforms previous real-time models, achieving 78.5% mIoU on the Cityscapes test set at a speed of 39 FPS on a GTX 1080Ti GPU.
Xiaobo Hu, Hongbo Zhu, Ning Su, Taosheng Xu
Shiming Xiang, Dong Zhou, Dan Tian, Zihao Wang
Hong Yin, Wenbin Xie, Jingjing Zhang, Yuanfa Zhang, Weixing Zhu, Jie Gao, Yan Shao, Yajun Li
Juan Wei, Hao Zhang, Tianping Li, Lina Han
Xinneng Yang, Yan Wu, Junqiao Zhao, Feilin Liu