JOURNAL ARTICLE

Progressive Guided Fusion Network With Multi-Modal and Multi-Scale Attention for RGB-D Salient Object Detection

Jiajia Wu, Guangliang Han, Haining Wang, Hang Yang, Qingqing Li, Dongxu Liu, Fangjian Ye, Peixun Liu

Year: 2021   Journal: IEEE Access   Vol: 9   Pages: 150608-150622   Publisher: Institute of Electrical and Electronics Engineers

Abstract

Depth maps contain abundant spatial structure cues and are therefore widely used in saliency detection to improve accuracy. However, acquired depth maps are often of uneven quality owing to interference from depth sensors and the external environment, which makes it challenging to minimize the disturbance from low-quality depth maps during fusion. In this article, to mitigate this issue and highlight salient objects, we propose a progressive guided fusion network (PGFNet) with multi-modal and multi-scale attention for RGB-D salient object detection. Specifically, we first present a multi-modal and multi-scale attention fusion model (MMAFM) that mines and exploits the complementarity of features at different scales and modalities to achieve optimal fusion. Then, to strengthen the semantic expressiveness of the shallow-layer features, we design a multi-modal feature refinement mechanism (MFRM), which uses the high-level fused feature to guide the enhancement of the original shallow-layer RGB and depth features before they are fused. Moreover, a residual prediction module (RPM) is applied to further suppress background elements. The entire network adopts a top-down strategy to progressively excavate and integrate valuable information. Experimental results on eight challenging benchmark datasets demonstrate, both qualitatively and quantitatively, the effectiveness of the proposed method compared with state-of-the-art methods.

Keywords:
Computer science; RGB color model; Artificial intelligence; Fusion; Feature (linguistics); Fusion mechanism; Salient; Benchmark (surveying); Residual; Computer vision; Modal; Robustness (evolution); Process (computing); Pattern recognition (psychology); Object detection; Scale (ratio); Algorithm

Metrics

Cited By: 2
FWCI (Field Weighted Citation Impact): 0.20
Refs: 81
Citation Normalized Percentile: 0.51

Topics

Visual Attention and Saliency Detection
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Advanced Neural Network Applications
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Advanced Image Fusion Techniques
Physical Sciences →  Engineering →  Media Technology