Zhengyi Liu, Yacheng Tan, Qian He, Yun Xiao
Convolutional neural networks (CNNs) are good at extracting contextual features within certain receptive fields, while transformers can model global long-range dependencies. By absorbing the advantages of the transformer and the merits of the CNN, Swin Transformer shows strong feature representation ability. Based on it, we propose SwinNet, a cross-modality fusion model for RGB-D and RGB-T salient object detection. It is driven by Swin Transformer to extract hierarchical features, boosted by an attention mechanism to bridge the gap between the two modalities, and guided by edge information to sharpen the contour of the salient object. Specifically, a two-stream Swin Transformer encoder first extracts multi-modality features, and then a spatial alignment and channel re-calibration module is presented to optimize intra-level cross-modality features. To clarify fuzzy boundaries, an edge-guided decoder achieves inter-level cross-modality fusion under the guidance of edge features. The proposed model outperforms state-of-the-art models on RGB-D and RGB-T datasets, showing that it provides more insight into the cross-modality complementarity task.
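The abstract's spatial alignment and channel re-calibration idea can be illustrated with a minimal NumPy sketch. This is an assumption-laden toy, not the authors' module: the learned fully-connected and convolutional layers are replaced by simple channel-wise and spatial averages, and all function names (`channel_recalibrate`, `spatial_align`, `fuse`) are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_recalibrate(feat, guide):
    # Squeeze-and-excitation-style gating: global-average-pool the guiding
    # modality into a per-channel descriptor, then re-weight the channels
    # of `feat`. (The paper's learned layers are omitted in this sketch.)
    gate = sigmoid(guide.mean(axis=(1, 2)))        # shape (C,)
    return feat * gate[:, None, None]

def spatial_align(feat, guide):
    # Spatial attention: averaging the guiding modality over channels gives
    # an (H, W) map that re-weights `feat` location by location.
    attn = sigmoid(guide.mean(axis=0))             # shape (H, W)
    return feat * attn[None, :, :]

def fuse(rgb, depth):
    # Cross-modality fusion: each stream is re-calibrated and aligned by
    # the other modality, then the two streams are summed.
    r = spatial_align(channel_recalibrate(rgb, depth), depth)
    d = spatial_align(channel_recalibrate(depth, rgb), rgb)
    return r + d

# Toy intra-level features from the two encoder streams (C, H, W).
rgb = np.random.rand(64, 16, 16).astype(np.float32)
depth = np.random.rand(64, 16, 16).astype(np.float32)
fused = fuse(rgb, depth)
print(fused.shape)  # (64, 16, 16)
```

The symmetric design (each modality gates the other) mirrors the abstract's goal of bridging the gap between modalities before inter-level decoding.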
Shuaihui Wang, Fengyi Jiang, Boqian Xu
Geng Chen, Qingyue Wang, Bo Dong, Ruitao Ma, Nian Liu, Huazhu Fu, Yong Xia
Xu Liu, Chenhua Liu, Xiaoming Zhou, Guodong Fan
Mingfeng Jiang, Jianhua Ma, Jiatong Chen, Yaming Wang, Xian Fang
Chao Zeng, Sam Kwong, Horace H. S. Ip