JOURNAL ARTICLE

Multi-Hierarchical Category Supervision for Weakly-Supervised Temporal Action Localization

Guozhang Li, Jie Li, Nannan Wang, Xinpeng Ding, Zhifeng Li, Xinbo Gao

Year: 2021
Journal: IEEE Transactions on Image Processing
Vol: 30
Pages: 9332-9344
Publisher: Institute of Electrical and Electronics Engineers

Abstract

Weakly Supervised Temporal Action Localization (WTAL) aims to localize action segments in untrimmed videos using only video-level category labels during training. In WTAL, an action generally consists of a series of sub-actions, and different action categories may share common sub-actions. However, to distinguish action categories with only video-level class labels, current WTAL models tend to focus on the discriminative sub-actions of an action while ignoring the common sub-actions shared across categories. Neglecting these common sub-actions leaves the localized action segments incomplete, i.e., containing only the discriminative sub-actions. Unlike current approaches that design complex network architectures to recover more complete actions, this paper introduces a novel supervision method, multi-hierarchical category supervision (MHCS), to find more sub-actions rather than only the discriminative ones. Specifically, action categories that share similar sub-actions are grouped into super-classes through hierarchical clustering. Training with the newly generated super-classes then encourages the model to pay more attention to the common sub-actions that are ignored when training with the original classes. Furthermore, the proposed MHCS is model-agnostic and non-intrusive, so it can be applied directly to existing methods without changing their structures. Through extensive experiments, we verify that our supervision method improves the performance of four state-of-the-art WTAL methods on three public datasets: THUMOS14, ActivityNet1.2, and ActivityNet1.3.
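
The super-class construction described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which features are clustered, which distance is used, or how many hierarchy levels are built. The sketch assumes each action category has a hypothetical prototype vector, uses cosine distance with average linkage (both assumptions), and introduces the illustrative helper names `build_super_classes` and `to_super_label`.

```python
import math


def cosine_distance(u, v):
    """Cosine distance between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def build_super_classes(prototypes, num_super):
    """Average-linkage agglomerative clustering of class prototypes.

    Assumption: `prototypes[c]` is some feature vector summarizing class c.
    Returns a list mapping each class index to a super-class index.
    """
    clusters = [[i] for i in range(len(prototypes))]
    while len(clusters) > num_super:
        best = None
        # Find the pair of clusters with the smallest average distance.
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                total = sum(
                    cosine_distance(prototypes[i], prototypes[j])
                    for i in clusters[a]
                    for j in clusters[b]
                )
                dist = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or dist < best[0]:
                    best = (dist, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    mapping = [0] * len(prototypes)
    for super_idx, members in enumerate(clusters):
        for c in members:
            mapping[c] = super_idx
    return mapping


def to_super_label(video_label, mapping, num_super):
    """Convert a multi-hot video-level class label to a super-class label."""
    out = [0] * num_super
    for c, present in enumerate(video_label):
        if present:
            out[mapping[c]] = 1
    return out


# Toy prototypes (made up for illustration): classes 0 and 1 are similar,
# and classes 2 and 3 are similar.
prototypes = [
    [1.0, 0.0, 0.1],
    [0.9, 0.1, 0.1],
    [0.0, 1.0, 0.0],
    [0.1, 0.9, 0.1],
]
mapping = build_super_classes(prototypes, num_super=2)
print(mapping)                                   # [0, 0, 1, 1]
print(to_super_label([0, 1, 0, 0], mapping, 2))  # [1, 0]
```

Under these assumptions, the super-class labels would serve as additional video-level targets alongside the original class labels. Calling `build_super_classes` with several values of `num_super` would yield several levels of a hierarchy.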

Keywords:
Discriminative model; Computer science; Artificial intelligence; Action (physics); Class (philosophy); Machine learning; Hierarchical clustering; Cluster analysis; Pattern recognition (psychology); Natural language processing

Metrics

Cited By: 12
FWCI (Field Weighted Citation Impact): 1.12
Refs: 64
Citation Normalized Percentile: 0.80

Topics

Human Pose and Action Recognition
Physical Sciences → Computer Science → Computer Vision and Pattern Recognition
Multimodal Machine Learning Applications
Physical Sciences → Computer Science → Computer Vision and Pattern Recognition
Anomaly Detection Techniques and Applications
Physical Sciences → Computer Science → Artificial Intelligence