JOURNAL ARTICLE

Salient Object Detection Based on Transformer and Multi-scale Feature Fusion

Abstract

Convolutional neural networks have difficulty learning global, long-range semantic information, while Vision Transformer, Swin Transformer, and Pyramid Vision Transformer have difficulty capturing multi-scale feature information. Salient objects, however, may appear at different scales. This paper therefore adopts Shunted Transformer as the backbone network to extract multi-scale features for salient object detection. Fusing high-level and low-level features tends to ignore the differences between features and to dilute the high-level features; to address this, a decoder that progressively fuses multi-scale features is designed. In addition, when the boundary prediction structure is separated from the salient object prediction branch, the boundary features obtained may not match the salient object; to address this, the paper draws on the BIG module and optimizes its feature input. Finally, experiments on four widely used datasets verify the effectiveness of the proposed model.
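
The idea of progressively fusing multi-scale features can be illustrated with a minimal sketch. This is not the decoder proposed in the paper, whose details the abstract does not give; it is a generic coarse-to-fine fusion under stated assumptions. The function names (upsample2x, fuse, progressive_fusion), the fixed blending weight w_high, and the toy single-channel feature maps are all hypothetical and chosen only for illustration.

```python
# Illustrative only: a generic coarse-to-fine fusion, NOT the paper's decoder.
# Feature maps are plain 2D lists (one channel) so the sketch has no dependencies.

def upsample2x(fmap):
    """Nearest-neighbour upsampling: each value becomes a 2x2 block."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def fuse(high, low, w_high=0.5):
    """Blend an upsampled high-level map with a same-sized low-level map.

    w_high is a hypothetical fixed weight; a learned decoder would replace it.
    """
    up = upsample2x(high)
    assert len(up) == len(low) and len(up[0]) == len(low[0])
    return [
        [w_high * u + (1.0 - w_high) * l for u, l in zip(up_row, low_row)]
        for up_row, low_row in zip(up, low)
    ]


def progressive_fusion(pyramid):
    """Fuse a pyramid ordered coarse -> fine, one scale at a time."""
    fused = pyramid[0]
    for finer in pyramid[1:]:
        fused = fuse(fused, finer)
    return fused


# Toy pyramid: 1x1, 2x2 and 4x4 maps standing in for backbone stages.
pyramid = [
    [[8.0]],
    [[4.0, 4.0], [4.0, 4.0]],
    [[0.0] * 4 for _ in range(4)],
]
saliency = progressive_fusion(pyramid)
```

Fusing one scale at a time, rather than resizing all maps and summing them in a single step, lets each stage control how much high-level signal is carried forward. In this toy version the control is only the fixed weight w_high.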

Keywords:
Salient; Computer science; Artificial intelligence; Transformer; Pattern recognition (psychology); Convolutional neural network; Feature extraction; Computer vision; Object detection; Engineering; Voltage

Metrics

Cited By: 2
FWCI (Field Weighted Citation Impact): 0.36
Refs: 19
Citation Normalized Percentile: 0.51

Topics

Visual Attention and Saliency Detection
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Advanced Neural Network Applications
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Advanced Image and Video Retrieval Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
