Jinchao Zhu, Xiaoyu Zhang, Xian Fang, Feng Dong, Yu Qiu
Multi-modal salient object detection models based on RGB-D information are more robust in real-world scenes. However, it remains nontrivial to adaptively balance the effective information from the two modalities during feature fusion. In this letter, we propose a novel gated recoding network (GRNet) to evaluate the validity of the information carried by each modality and to balance their influence. Our framework is divided into three phases: a perception phase, a recoding-mixing phase, and a feature-integration phase. First, a perception encoder extracts multi-level single-modal features, which lays the foundation for multi-modal semantic comparative analysis. Then, a modal-adaptive gate unit (MGU) suppresses invalid information and transfers the effective modal features to the recoding mixer and the hybrid branch decoder. The recoding mixer is responsible for recoding and mixing the balanced multi-modal information. Finally, the hybrid branch decoder completes multi-level feature integration under the guidance of an optional edge guidance stream (OEGS). Experiments and analysis on eight popular benchmarks verify that our framework performs favorably against nine state-of-the-art methods.
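The gating idea described above can be sketched in plain Python. This is a minimal illustrative sketch, not the paper's actual MGU: the function name `modal_gate`, the linear scoring scheme, and the sigmoid-based normalization are all assumptions introduced for illustration, since the abstract does not specify the unit's exact formulation.

```python
import math

def modal_gate(rgb_feat, depth_feat, w_rgb, w_depth):
    """Hypothetical sketch of a modal-adaptive gate: score each
    modality, squash the scores with a sigmoid, and use the resulting
    weights to suppress the less informative stream before fusion.
    The scoring weights w_rgb / w_depth stand in for learned params."""
    # Score each modality's validity (assumed linear scoring)
    s_rgb = sum(x * w for x, w in zip(rgb_feat, w_rgb))
    s_depth = sum(x * w for x, w in zip(depth_feat, w_depth))
    # Gate values in (0, 1)
    g_rgb = 1.0 / (1.0 + math.exp(-s_rgb))
    g_depth = 1.0 / (1.0 + math.exp(-s_depth))
    # Normalize so the two gates compete for influence
    total = g_rgb + g_depth
    g_rgb, g_depth = g_rgb / total, g_depth / total
    # Element-wise balanced fusion of the two modal features
    fused = [g_rgb * r + g_depth * d
             for r, d in zip(rgb_feat, depth_feat)]
    return fused, (g_rgb, g_depth)
```

A modality whose features score as uninformative receives a gate value near zero, so its contribution to the fused representation is suppressed rather than averaged in uniformly.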