Lian XuMohammed BennamounFarid BoussaïdWanli OuyangDan Xu
Abstract Weakly supervised semantic segmentation (WSSS) commonly relies on Class Activation Mapping (CAM) to produce pseudo semantic labels using image-level annotations. However, because CAM maps often form sparse object regions with poor boundaries, they cannot provide sufficient segmentation supervision. Because off-the-shelf saliency maps can provide rich object boundaries that can be leveraged to improve semantic segmentation, we propose to jointly learn semantic segmentation and class-agnostic masks by using image-level annotations and off-the-shelf saliency maps as supervision. We also propose a cross-task label refinement mechanism, which takes advantage of the learned class-agnostic masks and semantic segmentation masks, to refine the pseudo labels and provide more accurate supervision to both tasks. Moreover, we introduce a new normalization method for CAM to generate more complete class-specific localization maps. The improved CAM maps complement our learned class-agnostic masks, leading to high-quality pseudo semantic segmentation labels. Extensive experiments demonstrate the effectiveness of the proposed approach, with state-of-the-art WSSS results established on PASCAL VOC 2012 and MS COCO.
Junwei WuMingjie SunHaotian XuChenru JiangWuwei MaQuan Zhang
Xinggang WangJiapei FengBin HuQi DingLongjin RanXiaoxin ChenWenyu Liu
David Minkwan KimSoeun LeeByeongkeun Kang