In recent years, many salient object detection (SOD) methods have introduced depth cues to boost detection performance in challenging scenes, a task known as RGB-D SOD. However, effectively fusing cross-modal features with different properties (i.e., RGB and depth) remains an unavoidable key issue. Most existing methods employ simple operations, such as concatenation or summation, for cross-modal fusion, ignoring the negative effects of low-quality depth maps and thus yielding poor performance. In this paper, we design a simple yet effective fusion method that utilizes 3D convolution to extract modality-specific and modality-shared information for sufficient cross-modal fusion, and combines modality weights to mitigate the interference of invalid information. In addition, we propose a novel multi-level feature integration strategy in the decoder, which explicitly incorporates low-level detail information and high-level semantic information into the mid-level features to generate accurate saliency maps. Extensive experiments on six public datasets show that our method achieves competitive results against 17 state-of-the-art methods.
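The core idea of the fusion can be illustrated with a minimal numpy sketch: stack the RGB and depth feature maps along a new "modality" axis so that a 3D convolution can mix the two modalities jointly, after re-weighting each modality to suppress unreliable depth. All shapes, the pooling-based weighting, and the random kernel below are illustrative assumptions, not the paper's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy backbone features for one image: (channels, H, W).
# Sizes are illustrative assumptions only.
C, H, W = 8, 16, 16
f_rgb = rng.standard_normal((C, H, W))
f_depth = rng.standard_normal((C, H, W))

# Modality weights from globally pooled responses (softmax),
# intended to down-weight a low-quality depth map.
g = np.array([f_rgb.mean(), f_depth.mean()])
w = np.exp(g) / np.exp(g).sum()
f_rgb, f_depth = w[0] * f_rgb, w[1] * f_depth

# Stack along a new "modality" axis -> a 4D volume (C, D=2, H, W),
# so a 3D convolution sees both modalities in its receptive field.
vol = np.stack([f_rgb, f_depth], axis=1)  # (C, 2, H, W)

# A depth-2, 1x1-spatial 3D kernel: collapses the modality axis while
# mixing channels; weights are random stand-ins for learned parameters.
C_out = 8
kernel = rng.standard_normal((C_out, C, 2)) / np.sqrt(C * 2)
fused = np.einsum('cdhw,ocd->ohw', vol, kernel)  # (C_out, H, W)

print(fused.shape)
```

In a real network the 3D kernel would also span a spatial neighborhood and the modality weights would be predicted by a learned sub-module; the sketch only shows why stacking into a volume lets a single 3D convolution capture both modality-shared and modality-specific structure.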
Zhiqiang Cui, Zhengyong Feng, Feng Wang, Qiang Liu
Qinsheng Du, Yingxu Bian, Jianyu Wu, Shiyan Zhang, Jian Zhao
Zhengyi Liu, Yuan Wang, Yacheng Tan, Wei Li, Yun Xiao