Jinchao Zhu, Xiaoyu Zhang, Feng Dong, Siyu Yan, Xianbang Meng, Yuehua Li, Panlong Tan
RGB-Thermal salient object detection (RGB-T SOD) aims to segment the most salient objects more accurately through the cooperation of visible and thermal infrared images. The addition of thermal infrared images improves the accuracy of robot decision-making in complex visual tasks. How to exploit the complementarity of the two modalities, mine the dominant modal information, and better locate objects remains a problem worth exploring. In this paper, we propose an adaptive interaction promotion network (AIPNet). Specifically, we design a modal interaction module (MIM) with two parallel units to fuse the modal features extracted by the encoders: the spatial interaction unit (SIU) directly performs modal interaction and integration, while the self-reinforcement unit (SRU) enhances the two single-modal features and amplifies the role of the dominant modality. In addition, a query-location module (QLM) is applied to the high-level features to accurately locate salient objects. Finally, a re-calibration dual-branch decoder (RCDB) integrates the output features. Extensive experiments on RGB-T and RGB-D SOD datasets demonstrate that the proposed method performs favorably against 13 state-of-the-art methods.