Rongtao Xu, Changwei Wang, Jiaxi Sun, Shibiao Xu, Weiliang Meng, Xiaopeng Zhang
Efficiently training accurate deep models for weakly supervised semantic segmentation (WSSS) with image-level labels is both challenging and important. Recently, end-to-end WSSS methods have become a research focus because of their high training efficiency. However, current methods extract insufficiently comprehensive semantic information, which yields low-quality pseudo-labels and sub-optimal solutions for end-to-end WSSS. To address this, we propose a simple and novel Self Correspondence Distillation (SCD) method that refines pseudo-labels without introducing external supervision. SCD lets the network use feature correspondence derived from itself as a distillation target, which enhances the network's feature learning by supplying complementary semantic information. In addition, to further improve segmentation accuracy, we design a Variation-aware Refine Module that enhances the local consistency of pseudo-labels by computing pixel-level variation. Finally, we present an efficient end-to-end Transformer-based framework (TSCD) that combines SCD and the Variation-aware Refine Module for accurate WSSS. Extensive experiments on the PASCAL VOC 2012 and MS COCO 2014 datasets demonstrate that our method significantly outperforms other state-of-the-art methods. Our code is available at https://github.com/Rongtao-Xu/RepresentationLearning/tree/main/SCD-AAAI2023.
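The idea of using a feature correspondence as a distillation target can be illustrated with a small sketch. The abstract does not say how the correspondence is computed or which loss is applied, so everything below is an assumption made for illustration: the correspondence is taken to be pairwise cosine similarity between per-pixel feature vectors, the loss is a mean absolute difference, and the function names `cosine_correspondence` and `correspondence_distillation_loss` are hypothetical and not taken from the authors' code.

```python
import math


def cosine_correspondence(features):
    """Pairwise cosine similarity between per-pixel feature vectors.

    `features` is a list of N vectors (one per pixel), each a list of floats.
    Returns an N x N matrix as nested lists.
    Assumption: cosine similarity stands in for the "feature correspondence".
    """
    # Guard against zero-length vectors by falling back to a norm of 1.0.
    norms = [math.sqrt(sum(x * x for x in f)) or 1.0 for f in features]
    return [
        [sum(a * b for a, b in zip(fi, fj)) / (ni * nj)
         for fj, nj in zip(features, norms)]
        for fi, ni in zip(features, norms)
    ]


def correspondence_distillation_loss(student_feats, target_feats):
    """Mean absolute difference between two correspondence matrices.

    Assumption: `target_feats` plays the role of the distillation target and
    would be treated as fixed (no gradient) in a real training loop.
    """
    s = cosine_correspondence(student_feats)
    t = cosine_correspondence(target_feats)
    n = len(s)
    return sum(abs(s[i][j] - t[i][j]) for i in range(n) for j in range(n)) / (n * n)


# Toy usage: two "pixels" with 2-dimensional features.
student = [[1.0, 0.0], [0.0, 1.0]]   # orthogonal features
target = [[1.0, 0.0], [1.0, 0.0]]    # identical features
loss = correspondence_distillation_loss(student, target)
```

In this toy case the student treats the two pixels as unrelated while the target treats them as identical, so the loss is nonzero; it drops to zero when both feature sets induce the same pairwise similarities.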
Jianjun Chen, Shancheng Fang, Hongtao Xie, Zheng-Jun Zha, Yue Hu, Jianlong Tan
Yue Liu, Jun Zeng, X. Tao, Gang Fang
Bingfeng Zhang, Jimin Xiao, Yunchao Wei, Kaizhu Huang, Shan Luo, Yao Zhao
Xiaoyan Shao, Jiaqi Han, Lingling Li, Xuezhuan Zhao, Jingjing Yan
Lei Zhu, Xinliang Zhang, Hangzhou He, Qian Chen, Sha Li, Shuang Zeng, Yibao Zhang, Qiushi Ren, Yanye Lu