JOURNAL ARTICLE

Dynamic Graph Modeling for Weakly-Supervised Temporal Action Localization

Haichao Shi, Xiaoyu Zhang, Changsheng Li, Lixing Gong, Yong Li, Yongjun Bao

Year: 2022; Journal: Proceedings of the 30th ACM International Conference on Multimedia; Pages: 3820-3828

Abstract

Weakly supervised action localization is a challenging task that aims to localize action instances in untrimmed videos given only video-level supervision. Existing methods mostly distinguish action from background via attentive feature fusion of the RGB and optical flow modalities. Unfortunately, this strategy fails to retain the distinct characteristics of each modality, leading to inaccurate localization in hard-to-discriminate cases such as action-context interference and in-action stationary periods. As an action typically comprises multiple stages, an intuitive solution is to model the relations between finer-grained action segments to obtain a more detailed analysis. In this paper, we propose a dynamic graph-based method, namely DGCNN, to explore the two-stream relations between action segments. Specifically, segments within a video that are likely to be actions are dynamically selected to construct an action graph. For each graph, a triplet adjacency matrix consisting of three components (mutual importance, feature similarity, and high-level contextual similarity) is devised to explore the temporal and contextual correlations between the pseudo action segments. The two-stream dynamic pseudo graphs, along with the pseudo background segments, are used to derive a more detailed video representation. For action localization, a non-local-based temporal refinement module is proposed to fully leverage the temporal consistency between consecutive segments. Experimental results on three datasets, i.e., THUMOS14, ActivityNet v1.2 and v1.3, demonstrate that our method is superior to state-of-the-art methods.
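
The graph construction described in the abstract can be illustrated with a minimal NumPy sketch. The abstract gives no formulas, so everything concrete here is an assumption made for illustration and is not the authors' implementation: top-k selection by an actionness score, an outer product of scores for "mutual importance", cosine similarity for "feature similarity", cosine similarity after a hypothetical projection (`context_proj`) for "high-level contextual similarity", and a plain average of the three components. The function names are hypothetical as well.

```python
import numpy as np


def cosine_similarity_matrix(x):
    """Pairwise cosine similarity between the rows of x."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    unit = x / np.maximum(norms, 1e-8)
    return unit @ unit.T


def build_action_graph(features, scores, k, context_proj):
    """Select k likely-action segments and build one combined adjacency.

    features:     (T, D) per-segment features from one stream (RGB or flow).
    scores:       (T,) actionness scores in [0, 1] (assumed to be given).
    context_proj: (D, C) hypothetical projection to a "high-level" space.
    """
    # Dynamic selection: keep the k highest-scoring segments, in temporal order.
    idx = np.sort(np.argsort(scores)[-k:])
    nodes = features[idx]
    s = scores[idx]

    # Three assumed components of the adjacency matrix.
    mutual = np.outer(s, s)                       # mutual importance
    feat_sim = cosine_similarity_matrix(nodes)    # feature similarity
    ctx_sim = cosine_similarity_matrix(           # contextual similarity
        np.tanh(nodes @ context_proj)
    )

    # Assumed fusion: average, clip negative edges, then row-normalize.
    adj = np.maximum((mutual + feat_sim + ctx_sim) / 3.0, 0.0)
    adj = adj / np.maximum(adj.sum(axis=1, keepdims=True), 1e-8)
    return idx, nodes, adj


def graph_conv(nodes, adj, weight):
    """One graph-convolution step: aggregate neighbours, project, ReLU."""
    return np.maximum(adj @ nodes @ weight, 0.0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    T, D, C, K = 20, 8, 4, 5
    feats = rng.normal(size=(T, D))
    scores = rng.uniform(size=T)
    idx, nodes, adj = build_action_graph(feats, scores, K, rng.normal(size=(D, C)))
    out = graph_conv(nodes, adj, rng.normal(size=(D, D)))
    print(idx.shape, adj.shape, out.shape)
```

In a two-stream setting, the same routine would be run once on RGB features and once on optical-flow features, which keeps each modality's graph separate rather than fusing the features first.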

Keywords:
Computer science; Artificial intelligence; Adjacency matrix; Pattern recognition (psychology); Leverage (statistics); Graph; Optical flow; Context (archaeology); Theoretical computer science; Image (mathematics)

Metrics

Cited By: 28
FWCI (Field Weighted Citation Impact): 1.93
Refs: 26
Citation Normalized Percentile: 0.90

Topics

Human Pose and Action Recognition
Physical Sciences → Computer Science → Computer Vision and Pattern Recognition
Multimodal Machine Learning Applications
Physical Sciences → Computer Science → Computer Vision and Pattern Recognition
Anomaly Detection Techniques and Applications
Physical Sciences → Computer Science → Artificial Intelligence

Related Documents

JOURNAL ARTICLE

ACGNet: Action Complement Graph Network for Weakly-Supervised Temporal Action Localization

Zichen Yang, Jie Qin, Di Huang

Journal: Proceedings of the AAAI Conference on Artificial Intelligence; Year: 2022; Vol: 36 (3); Pages: 3090-3098

JOURNAL ARTICLE

Modeling Sub-Actions for Weakly Supervised Temporal Action Localization

Linjiang Huang, Yan Huang, Wanli Ouyang, Liang Wang

Journal: IEEE Transactions on Image Processing; Year: 2021; Vol: 30; Pages: 5154-5167

JOURNAL ARTICLE

StochasticFormer: Stochastic Modeling for Weakly Supervised Temporal Action Localization

Haichao Shi, Xiaoyu Zhang, Changsheng Li

Journal: IEEE Transactions on Image Processing; Year: 2023; Vol: 32; Pages: 1379-1389