JOURNAL ARTICLE

Discriminative Action Snippet Propagation Network for Weakly Supervised Temporal Action Localization

Yuanjie Dang, Huang Chun-xia, Peng Chen, Dongdong Zhao, Nan Gao, Ronghua Liang, Ruohong Huan

Year: 2024   Journal: ACM Transactions on Multimedia Computing, Communications, and Applications   Vol: 20 (6)   Pages: 1-21   Publisher: Association for Computing Machinery

Abstract

Weakly supervised temporal action localization (WTAL) aims to classify and localize actions in untrimmed videos with only video-level labels. Recent studies have attempted to obtain more accurate temporal boundaries by exploiting latent action instances in ambiguous snippets or propagating representative action features. However, empirically handcrafted ambiguous snippet extraction and the imprecise alignment of representative snippet propagation lead to challenges in modeling the completeness of actions for these methods. In this article, we propose a Discriminative Action Snippet Propagation Network (DASP-Net) to accurately discover ambiguous snippets in videos and propagate discriminative instance-level features throughout the video for improving action completeness. Specifically, we introduce a novel discriminative feature propagation module for capturing the global contextual attention and propagating the action concept across the whole video by perceiving the discriminative action snippets with instance information from the same video. Simultaneously, we incorporate denoised pseudo-labels as supervision, where we correct the controversial prediction based on the feature space distribution during training, thereby alleviating false detection caused by noise background features. Furthermore, we design an ambiguous feature mining module, which maximizes the feature affinity information of action and background in ambiguous snippets to generate more accurate latent action and background snippets and learns more precise action instance boundaries through contrastive learning of action and background snippets. Extensive experiments show that DASP-Net achieves state-of-the-art results on THUMOS14 and ActivityNet1.2 datasets.
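As a reading aid for the task definition above, the localization step common to WTAL pipelines can be sketched as follows. This is a generic illustration, not DASP-Net's method: it assumes a model has already produced one action score per snippet, and the function name `localize`, the threshold of 0.5, and the one-second snippet length are all hypothetical choices made for the example.

```python
from typing import List, Tuple


def localize(
    scores: List[float],
    threshold: float = 0.5,    # assumed cutoff, not taken from the article
    snippet_sec: float = 1.0,  # assumed snippet duration in seconds
) -> List[Tuple[float, float]]:
    """Group consecutive above-threshold snippets into (start, end) segments."""
    segments: List[Tuple[float, float]] = []
    start = None  # index of the first snippet in the currently open segment
    for i, score in enumerate(scores):
        if score > threshold and start is None:
            start = i  # a new segment opens here
        elif score <= threshold and start is not None:
            # the segment closes just before snippet i
            segments.append((start * snippet_sec, i * snippet_sec))
            start = None
    if start is not None:
        # the video ended while a segment was still open
        segments.append((start * snippet_sec, len(scores) * snippet_sec))
    return segments


# Hypothetical per-snippet scores for a single action class.
scores = [0.1, 0.2, 0.8, 0.9, 0.7, 0.3, 0.1, 0.6, 0.9]
print(localize(scores))
```

Under this simple scheme, a snippet whose score sits near the threshold can split or truncate a segment, which is the kind of incomplete-action error that handling ambiguous snippets is meant to reduce.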

Keywords:
Snippet, Discriminative model, Action (physics), Computer science, Artificial intelligence, Pattern recognition (psychology), Physics, Information retrieval

Metrics

Cited By: 3
FWCI (Field Weighted Citation Impact): 1.59
Refs: 33
Citation Normalized Percentile: 0.72

Topics

Human Pose and Action Recognition
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Anomaly Detection Techniques and Applications
Physical Sciences →  Computer Science →  Artificial Intelligence
Gait Recognition and Analysis
Physical Sciences →  Engineering →  Biomedical Engineering

Related Documents

JOURNAL ARTICLE

Weakly Supervised Temporal Action Localization via Representative Snippet Knowledge Propagation

Linjiang Huang, Liang Wang, Hongsheng Li

Journal: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)   Year: 2022
JOURNAL ARTICLE

Deep snippet selective network for weakly supervised temporal action localization

Yongxin Ge, Xiaolei Qin, Dan Yang, Martin Jägersand

Journal: Pattern Recognition   Year: 2020   Vol: 110   Pages: 107686
JOURNAL ARTICLE

Snippet-to-Prototype Contrastive Consensus Network for Weakly Supervised Temporal Action Localization

Yuxiang Shao, Feifei Zhang, Changsheng Xu

Journal: IEEE Transactions on Multimedia   Year: 2024   Vol: 26   Pages: 6717-6729
JOURNAL ARTICLE

Weakly-Supervised Temporal Action Localization by Inferring Salient Snippet-Feature

Wulian Yun, Mengshi Qi, Chuanming Wang, Huadóng Ma

Journal: Proceedings of the AAAI Conference on Artificial Intelligence   Year: 2024   Vol: 38 (7)   Pages: 6908-6916