Jinchao Zhu, Xiaoyu Zhang, Feng Dong, Siyu Yan, Xianbang Meng, Yuehua Li, Panlong Tan
RGB-Thermal salient object detection (RGB-T SOD) aims to segment the most salient objects more accurately through the cooperation of visible and thermal infrared images. The addition of thermal infrared images improves the accuracy of robot decision-making in complex visual tasks. How to exploit the complementarity of the two modalities, mine the dominant modal information, and better locate objects remains a problem worth exploring. In this paper, we propose an adaptive interaction promotion network (AIPNet). Specifically, we design a modal interaction module (MIM) with two parallel units to fuse the modal features extracted by the encoders: the spatial interaction unit (SIU) directly performs modal interaction and integration, while the self-reinforcement unit (SRU) enhances the two single-modal features and amplifies the role of the dominant modality. In addition, a query-location module (QLM) is applied to the high-level features to accurately locate salient objects. Finally, a re-calibration dual-branch decoder (RCDB) integrates the output features. Extensive experiments on RGB-T and RGB-D SOD datasets demonstrate that the proposed method performs favorably against 13 state-of-the-art methods.