JOURNAL ARTICLE

RGB-D Saliency Detection with 3D Cross-modal Fusion and Mid-level Integration

Abstract

In recent years, many salient object detection (SOD) methods have introduced depth cues to improve detection performance in challenging scenes, a task known as RGB-D SOD. However, how to effectively fuse cross-modal features with different properties (i.e., RGB and depth) has become an unavoidable key issue. Most existing methods employ simple operations such as concatenation or summation for cross-modal fusion and ignore the negative effects of low-quality depth maps, which degrades performance. In this paper, we design a simple yet effective fusion method that uses 3D convolution to extract modality-specific and modality-shared information separately for sufficient cross-modal fusion, and combines modality weights to mitigate the interference of invalid information. In addition, we propose a novel multi-level feature integration strategy in the decoder, which explicitly incorporates low-level detail information and high-level semantic information into the mid-level features to generate accurate saliency maps. Extensive experiments on six public datasets show that our method achieves competitive results compared with 17 state-of-the-art methods.
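The fusion idea in the abstract can be illustrated with a toy, single-channel sketch. It is not the authors' network: the function names (`conv3d_valid`, `fuse`), the fixed kernels, and the hand-picked modality weights are all assumptions made for illustration, and the abstract does not say how the weights are obtained. The sketch stacks an RGB feature map and a depth feature map along a new modality axis, so that a 3D kernel of depth 1 responds to each modality separately (modality-specific) while a kernel of depth 2 spans both (modality-shared).

```python
def conv3d_valid(volume, kernel):
    """Single-channel 3D cross-correlation, stride 1, no padding.

    volume: D x H x W nested lists; kernel: kd x kh x kw nested lists.
    """
    D, H, W = len(volume), len(volume[0]), len(volume[0][0])
    kd, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for d in range(D - kd + 1):
        plane = []
        for i in range(H - kh + 1):
            row = []
            for j in range(W - kw + 1):
                row.append(sum(
                    volume[d + a][i + b][j + c] * kernel[a][b][c]
                    for a in range(kd)
                    for b in range(kh)
                    for c in range(kw)
                ))
            plane.append(row)
        out.append(plane)
    return out


def fuse(rgb, depth, specific_kernel, shared_kernel, w_rgb, w_depth):
    """Hypothetical fusion: weighted modality-specific parts plus a shared part.

    Both kernels must have the same spatial size so the outputs align.
    """
    # Stack the modalities along a leading "modality" axis: 2 x H x W.
    volume = [rgb, depth]
    # Depth-1 kernel: slides over one modality at a time, giving two outputs.
    spec_rgb, spec_depth = conv3d_valid(volume, specific_kernel)
    # Depth-2 kernel: covers both modalities at once, giving one output.
    (shared,) = conv3d_valid(volume, shared_kernel)
    # Assumed weighting scheme: scale each specific branch by its modality
    # weight, then add the shared response.
    return [
        [w_rgb * r + w_depth * d + s for r, d, s in zip(r_row, d_row, s_row)]
        for r_row, d_row, s_row in zip(spec_rgb, spec_depth, shared)
    ]


rgb_feat = [[1.0, 2.0], [3.0, 4.0]]
depth_feat = [[0.0, 0.0], [0.0, 8.0]]   # pretend the 8.0 is a depth artifact
specific_k = [[[1.0]]]                  # 1 x 1 x 1 kernel
shared_k = [[[0.5]], [[0.5]]]           # 2 x 1 x 1 kernel, averages modalities

# A depth weight of 0 suppresses the depth-specific branch entirely.
fused = fuse(rgb_feat, depth_feat, specific_k, shared_k, w_rgb=1.0, w_depth=0.0)
print(fused)  # [[1.5, 3.0], [4.5, 10.0]]
```

In this toy setup the depth artifact still reaches the output through the shared branch. A real model would use learned multi-channel 3D convolutions, and it may derive or apply the modality weights differently.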

Keywords:
Concatenation (mathematics); Computer science; Fuse (electrical); Artificial intelligence; RGB color model; Modal; Modality (human–computer interaction); Convolution (computer science); Fusion; Key (lock); Feature (linguistics); Computer vision; Pattern recognition (psychology); Mathematics; Artificial neural network; Engineering

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 44
Citation Normalized Percentile: 0.17

Topics

Visual Attention and Saliency Detection
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Advanced Image and Video Retrieval Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Olfactory and Sensory Function Studies
Life Sciences →  Neuroscience →  Sensory Systems

Related Documents

JOURNAL ARTICLE

Attentive Cross-Modal Fusion Network for RGB-D Saliency Detection

Di Liu, Kao Zhang, Zhenzhong Chen

Journal:   IEEE Transactions on Multimedia Year: 2020 Vol: 23 Pages: 967-981
JOURNAL ARTICLE

RGB-D Saliency Detection based on Cross-Modal and Multi-scale Feature Fusion

Xuxing Zhu, Jin Wu, Lei Zhu

Journal:   2022 34th Chinese Control and Decision Conference (CCDC) Year: 2022 Pages: 6154-6160
JOURNAL ARTICLE

RGB-D Saliency Detection Based on Attention Mechanism and Multi-Scale Cross-Modal Fusion

Zhiqiang Cui, Zhengyong Feng, Feng Wang, Qiang Liu

Journal:   Journal of Computer-Aided Design & Computer Graphics Year: 2023 Vol: 35 (6) Pages: 803-902
JOURNAL ARTICLE

Cross-Modal Adaptive Interaction Network for RGB-D Saliency Detection

Qinsheng Du, Yingxu Bian, Jianyu Wu, Shiyan Zhang, Jian Zhao

Journal:   Applied Sciences Year: 2024 Vol: 14 (17) Pages: 7440-7440
JOURNAL ARTICLE

AGRFNet: Two-stage cross-modal and multi-level attention gated recurrent fusion network for RGB-D saliency detection

Zhengyi Liu, Yuan Wang, Yacheng Tan, Wei Li, Yun Xiao

Journal:   Signal Processing Image Communication Year: 2022 Vol: 104 Pages: 116674-116674