JOURNAL ARTICLE

Mutual-Guidance Transformer-Embedding Network for Video Salient Object Detection

Dingyao Min, Chao Zhang, Yukang Lu, Keren Fu, Qijun Zhao

Year: 2022 Journal: IEEE Signal Processing Letters Vol: 29 Pages: 1674-1678 Publisher: Institute of Electrical and Electronics Engineers

Abstract

Video salient object detection (VSOD) aims at locating the most attractive objects presented in video sequences by exploiting spatial and temporal cues. Previous methods mainly utilize convolutional neural networks (CNNs) to fuse or complement across RGB and optical flow cues via simple strategies. To take full advantage of CNNs and recently emerged Transformers, this letter proposes a novel mutual-guidance Transformer-embedding network, called MGT-Net, where a mutual-guidance multi-head attention mechanism (MGMA) explores more sophisticated long-range cross-modal interactions. Such a mechanism is designed into a new mutual-guidance Transformer (MGTrans) module that can propagate long-range contextual dependencies based on information of the other modality. To the best of our knowledge, MGT-Net is the first VSOD model that embeds Transformers as modules into CNNs for improved performance. Prior to MGTrans, we also propose and deploy a feature purification module (FPM) to purify noisy backbone features. Experimental results on five benchmark datasets demonstrate the state-of-the-art performance of MGT-Net.
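
The abstract does not give the internals of MGMA, so the following is only a minimal sketch of the general idea of cross-modal multi-head attention, not the authors' implementation. It assumes that queries come from one modality while keys and values come from the other, that no learned projections are applied, and that the result is added back through a residual connection. The names `attend`, `split_heads`, `cross_modal_attention`, and `mutual_guidance` are hypothetical and chosen for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(queries, keys, values):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = len(queries[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out


def split_heads(tokens, num_heads):
    """Split each token vector into num_heads equal-sized chunks."""
    assert len(tokens[0]) % num_heads == 0, "feature size must divide evenly"
    d = len(tokens[0]) // num_heads
    return [[t[h * d:(h + 1) * d] for t in tokens] for h in range(num_heads)]


def cross_modal_attention(target, guide, num_heads):
    """Assumed roles: queries from `target`, keys and values from `guide`."""
    heads = [attend(q, kv, kv)
             for q, kv in zip(split_heads(target, num_heads),
                              split_heads(guide, num_heads))]
    # Concatenate the head outputs per token, then add a residual connection.
    merged = [sum((head[i] for head in heads), []) for i in range(len(target))]
    return [[x + y for x, y in zip(t, m)] for t, m in zip(target, merged)]


def mutual_guidance(rgb, flow, num_heads=2):
    """Each modality is refined using attention over the other one."""
    rgb_out = cross_modal_attention(rgb, flow, num_heads)
    flow_out = cross_modal_attention(flow, rgb, num_heads)
    return rgb_out, flow_out


# Toy token sequences (3 tokens, 4 features each); values are made up.
rgb = [[1.0, 0.0, 0.5, 0.5], [0.0, 1.0, 0.5, 0.5], [1.0, 1.0, 0.0, 0.0]]
flow = [[0.2, 0.8, 0.1, 0.9], [0.9, 0.1, 0.7, 0.3], [0.5, 0.5, 0.5, 0.5]]
rgb_out, flow_out = mutual_guidance(rgb, flow)
```

In this sketch every output token can draw on every token of the other modality, which is one way to realize the long-range cross-modal dependencies the abstract refers to. A real module would typically add learned query, key, and value projections as well as normalization layers.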

Keywords:
Computer science, Transformer, Artificial intelligence, Embedding, Convolutional neural network, Optical flow, Mutual information, Pattern recognition (psychology), Computer vision, Object detection, Engineering, Voltage

Metrics

Cited By: 15
FWCI (Field Weighted Citation Impact): 1.86
Refs: 46
Citation Normalized Percentile: 0.84

Topics

Visual Attention and Saliency Detection
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Advanced Image and Video Retrieval Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Advanced Neural Network Applications
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition

Related Documents

JOURNAL ARTICLE

Transformer-based Cross Reference Network for video salient object detection

Kan Huang, Chunwei Tian, Jingyong Su, Jerry Chun‐Wei Lin

Journal: Pattern Recognition Letters Year: 2022 Vol: 160 Pages: 122-127
JOURNAL ARTICLE

Bidirectional mutual guidance transformer for salient object detection in optical remote sensing images

Kan Huang, Chunwei Tian, Ge Li

Journal: International Journal of Remote Sensing Year: 2023 Vol: 44 (13) Pages: 4016-4033
JOURNAL ARTICLE

TENet: Accurate light-field salient object detection with a transformer embedding network

Xingzheng Wang, Songwei Chen, Guoyao Wei, Jiehao Liu

Journal: Image and Vision Computing Year: 2022 Vol: 129 Pages: 104595
JOURNAL ARTICLE

STEG-Net: Spatiotemporal Edge Guidance Network for Video Salient Object Detection

Hongbo Bi, Lina Yang, Huihui Zhu, Di Lu, Jianguo Jiang

Journal: IEEE Transactions on Cognitive and Developmental Systems Year: 2021 Vol: 14 (3) Pages: 902-915