JOURNAL ARTICLE

Pyramid Constrained Self-Attention Network for Fast Video Salient Object Detection

Yuchao GuLijuan WangZiqin WangYun LiuMing‐Ming ChengShao-Ping Lu

Year: 2020 Journal:   Proceedings of the AAAI Conference on Artificial Intelligence Vol: 34 (07)Pages: 10869-10876   Publisher: Association for the Advancement of Artificial Intelligence

Abstract

Spatiotemporal information is essential for video salient object detection (VSOD) due to the highly attractive object motion for human's attention. Previous VSOD methods usually use Long Short-Term Memory (LSTM) or 3D ConvNet (C3D), which can only encode motion information through step-by-step propagation in the temporal domain. Recently, the non-local mechanism is proposed to capture long-range dependencies directly. However, it is not straightforward to apply the non-local mechanism into VSOD, because i) it fails to capture motion cues and tends to learn motion-independent global contexts; ii) its computation and memory costs are prohibitive for video dense prediction tasks such as VSOD. To address the above problems, we design a Constrained Self-Attention (CSA) operation to capture motion cues, based on the prior that objects always move in a continuous trajectory. We group a set of CSA operations in Pyramid structures (PCSA) to capture objects at various scales and speeds. Extensive experimental results demonstrate that our method outperforms previous state-of-the-art methods in both accuracy and speed (110 FPS on a single Titan Xp) on five challenge datasets. Our code is available at https://github.com/guyuchao/PyramidCSA.

Keywords:
Computer science Artificial intelligence Salient Computer vision Pyramid (geometry) Motion (physics) Trajectory ENCODE Code (set theory) Object detection Set (abstract data type) Computation Object (grammar) Pattern recognition (psychology) Algorithm

Metrics

159
Cited By
8.08
FWCI (Field Weighted Citation Impact)
57
Refs
0.98
Citation Normalized Percentile
Is in top 1%
Is in top 10%

Citation History

Topics

Visual Attention and Saliency Detection
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Video Surveillance and Tracking Methods
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Gaze Tracking and Assistive Technology
Physical Sciences →  Computer Science →  Human-Computer Interaction
© 2026 ScienceGate Book Chapters — All rights reserved.