Visual Saliency Prediction Using Attention-based Cross-modal Integration Network in RGB-D Images

Xinyue Zhang; Ting Jin; Mingjie Han; Jingsheng Lei; Zhichao Cao

doi:10.32604/iasc.2021.018643

JOURNAL ARTICLE

Year: 2021; Journal: Intelligent Automation & Soft Computing; Vol: 29 (3); Pages: 439-452; Publisher: Taylor & Francis

Abstract

Saliency prediction has recently attracted considerable attention owing to the rapid development of deep neural networks in computer vision tasks. However, several challenges still need to be addressed. In this paper, we design a visual saliency prediction model for RGB-D images that uses attention-based cross-modal integration strategies. Unlike symmetric feature extraction networks, we use asymmetric networks to extract depth features effectively as complementary information to the RGB features. We then propose attention modules that integrate cross-modal feature information and emphasize the feature representation of salient regions while suppressing the surrounding unimportant pixels, so as to reduce the loss of channel details during feature extraction. Moreover, we contribute successive dilated convolution modules, which reduce the number of training parameters and attain multi-scale receptive fields by using dilated convolution layers; these modules also promote the interaction between the two complementary sources of information. Finally, we build a decoder that explores the continuity and attributes of the enhanced features at different levels by gradually concatenating the outputs of the proposed modules, yielding the final high-quality saliency prediction maps. Experimental results on two widely used datasets demonstrate that our model outperforms six other state-of-the-art saliency models according to four evaluation metrics.
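The abstract's claim that stacked dilated convolutions give large receptive fields with few parameters can be checked with simple arithmetic. The sketch below is only an illustration under stated assumptions: the 3x3 kernel size and the dilation rates (1, 2, 4) are hypothetical choices, not values taken from the paper, and the helper names are invented for this example.

```python
def receptive_field(kernel_size, dilations):
    """Side length (in pixels) of the receptive field of stacked,
    stride-1 dilated convolutions with the given dilation rates."""
    rf = 1
    for d in dilations:
        # Each layer widens the field by (kernel_size - 1) * dilation.
        rf += (kernel_size - 1) * d
    return rf


def stacked_weights(kernel_size, num_layers):
    """Weights per input/output channel pair for the whole stack
    (biases ignored)."""
    return num_layers * kernel_size * kernel_size


# Hypothetical configuration: three 3x3 layers with dilations 1, 2, 4.
dilations = (1, 2, 4)
rf = receptive_field(3, dilations)
stack_cost = stacked_weights(3, len(dilations))
dense_cost = rf * rf  # one undilated kernel covering the same field

print(f"receptive field: {rf}x{rf}")
print(f"stacked dilated weights per channel pair: {stack_cost}")
print(f"single dense kernel weights per channel pair: {dense_cost}")
```

Under these assumed settings the stack covers the same field as one much larger dense kernel while using far fewer weights. The intermediate layers see smaller fields, which is one way such a stack can provide multi-scale context.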

Keywords:

Computer science; RGB color model; Convolution (computer science); Artificial intelligence; Feature (linguistics); Process (computing); Feature extraction; Pattern recognition (psychology); Representation (politics); Pixel; Modal; Salient; Artificial neural network; Computer vision

Metrics

FWCI (Field Weighted Citation Impact): 0.00

Citation Normalized Percentile: 0.10

Topics

Visual Attention and Saliency Detection

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Image and Video Quality Assessment

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Advanced Image Fusion Techniques

Physical Sciences → Engineering → Media Technology

Related Documents

Cross-Modal Feature Integration Network for Human Eye-Fixation Prediction in RGB-D Images

Attention-based contextual interaction asymmetric network for RGB-D saliency prediction

Cross-Modal Adaptive Interaction Network for RGB-D Saliency Detection

Attentive Cross-Modal Fusion Network for RGB-D Saliency Detection

RGB-D Saliency Detection Based on Attention Mechanism and Multi-Scale Cross-Modal Fusion