Weakly supervised salient object detection has become an attractive research direction because of the high cost of pixel-level labels, but the limited information provided by a single weak supervision source makes it difficult to train a well-performing model. In this paper, we propose a pseudo-label learning method, designed as a unified two-stage framework, for multi-source weakly supervised salient object detection. In the first stage, we introduce a two-branch network that learns from category labels and bounding boxes, respectively. The first branch is a classification network that produces class activation maps, and the second branch is trained on bounding boxes to generate salient bounding box attributed maps. We then propose an aggregation module that fuses the maps generated by the two branches and refines them into pixel-level pseudo-labels. In the second stage, we train a transformer model to predict saliency maps, using the final pseudo-labels as ground truth. We compare the proposed method with existing methods on six datasets, and the experimental results verify its effectiveness.
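As an illustration of the first-stage aggregation step, the following is a minimal sketch of how a class activation map and a bounding box attributed map could be fused into a binary pseudo-label. The abstract does not specify the fusion or refinement rule, so the element-wise product, the min-max normalization, the fixed threshold, and the names `normalize` and `fuse_maps` are all assumptions made for illustration, not the proposed aggregation module.

```python
import numpy as np


def normalize(m, eps=1e-8):
    """Min-max normalize a 2D map to the range [0, 1]."""
    m = m.astype(np.float64)
    lo, hi = m.min(), m.max()
    return (m - lo) / (hi - lo + eps)


def fuse_maps(cam, box_map, threshold=0.5):
    """Hypothetical fusion of two weak-supervision maps.

    Assumption: pixels are kept only where both the class activation
    map (cam) and the bounding box attributed map (box_map) respond
    strongly. The real module may use a different, learned rule.
    """
    if cam.shape != box_map.shape:
        raise ValueError("cam and box_map must have the same shape")
    # Element-wise product suppresses pixels supported by only one source.
    fused = normalize(normalize(cam) * normalize(box_map))
    # Thresholding stands in for the refinement step (an assumption).
    return (fused >= threshold).astype(np.uint8)


# Toy example: a 4x4 image with a 2x2 salient region.
cam = np.zeros((4, 4))
cam[1:3, 1:3] = 1.0   # classifier responds on the object
cam[0, 0] = 0.9       # spurious activation outside the box
box_map = np.zeros((4, 4))
box_map[1:3, 1:4] = 1.0  # box is slightly wider than the object
pseudo_label = fuse_maps(cam, box_map)
```

In this toy case the spurious activation at the corner and the extra box column are both discarded, because neither is supported by both sources. In the second stage, a map such as `pseudo_label` would play the role of the ground truth when training the saliency predictor.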
Yunping Zheng, Zhou Jiang, Shiqiang Shu, Yuze Zhu, Zejun Wang, Mudar Sarem
Runmin Cong, Qi Qin, Chen Zhang, Qiuping Jiang, Shiqi Wang, Yao Zhao, Sam Kwong