JOURNAL ARTICLE

Adaptive Prototype Learning for Weakly-Supervised Temporal Action Localization

Wang Luo, Huan Ren, Tianzhu Zhang, Wenfei Yang, Yongdong Zhang

Year: 2024; Journal: IEEE Transactions on Image Processing; Vol: 34; Pages: 3154-3168; Publisher: Institute of Electrical and Electronics Engineers

Abstract

Weakly-supervised Temporal Action Localization (WTAL) aims to localize action instances using only video-level labels during training, where the two primary issues are localization incompleteness and background interference. To relieve these two issues, recent methods adopt an attention mechanism to activate action instances while suppressing background ones, and have achieved remarkable progress. Nevertheless, we argue that these two issues have not yet been well resolved. On the one hand, the attention mechanism adopts fixed weights for different videos, which cannot handle the diversity across videos and is thus deficient in addressing localization incompleteness. On the other hand, previous methods focus only on learning the foreground attention, and the attention weights usually suffer from ambiguity, making it difficult to suppress background interference. To deal with these issues, we propose an Adaptive Prototype Learning (APL) method for WTAL, which includes two key designs: 1) an Adaptive Transformer Network (ATN) that explicitly models background and learns video-adaptive prototypes for each specific video; 2) an OT-based Collaborative (OTC) training strategy that guides the learning of prototypes and removes the ambiguity of foreground-background separation by introducing an Optimal Transport (OT) algorithm into the collaborative training scheme between the RGB and FLOW streams. These two designs work together to learn video-adaptive prototypes and address the above two issues, achieving robust localization. Extensive experimental results on two standard benchmarks (THUMOS14 and ActivityNet) demonstrate that the proposed APL performs favorably against state-of-the-art methods.
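
The abstract does not specify how the Optimal Transport step is formulated, so the following is only a generic illustration of the idea, not the authors' OTC strategy. It is a minimal sketch of entropy-regularised OT (Sinkhorn iterations) that softly assigns video snippets to a foreground and a background prototype. The cost values, the uniform marginals, the function name `sinkhorn`, and the parameters `eps` and `n_iters` are all assumptions made for this example.

```python
import math

def sinkhorn(cost, eps=0.1, n_iters=200):
    """Entropy-regularised OT plan between uniform marginals (illustrative only)."""
    n, m = len(cost), len(cost[0])
    a = [1.0 / n] * n  # mass per snippet (assumed uniform)
    b = [1.0 / m] * m  # mass per prototype (assumed uniform)
    # Gibbs kernel: low cost -> large kernel value
    K = [[math.exp(-c / eps) for c in row] for row in cost]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # rescale rows so each row of the plan sums to a[i]
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        # rescale columns so each column of the plan sums to b[j]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]

# Hypothetical costs (e.g. 1 - similarity) for 4 snippets vs. 2 prototypes:
# column 0 = foreground prototype, column 1 = background prototype.
cost = [
    [0.1, 0.9],
    [0.2, 0.8],
    [0.9, 0.1],
    [0.8, 0.2],
]
plan = sinkhorn(cost)
# Hard assignment per snippet: index of the prototype receiving the most mass.
labels = [max(range(len(row)), key=row.__getitem__) for row in plan]
```

In this toy setup the transport plan acts as a soft, balanced foreground-background assignment: the marginal constraints prevent all snippets from collapsing onto one prototype, which is one common reason OT is used to reduce assignment ambiguity.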

Keywords:
Computer science, Ambiguity, Artificial intelligence, Machine learning, Key (lock)

Metrics

Cited By: 1
FWCI (Field Weighted Citation Impact): 0.53
Refs: 97
Citation Normalized Percentile: 0.55

Topics

Human Pose and Action Recognition
Physical Sciences → Computer Science → Computer Vision and Pattern Recognition
Stroke Rehabilitation and Recovery
Health Sciences → Medicine → Rehabilitation
Anomaly Detection Techniques and Applications
Physical Sciences → Computer Science → Artificial Intelligence

Related Documents

JOURNAL ARTICLE

Ensemble Prototype Network For Weakly Supervised Temporal Action Localization

Kewei Wu, Wenjie Luo, Zhao Xie, Dan Guo, Zhao Zhang, Richang Hong

Journal: IEEE Transactions on Neural Networks and Learning Systems; Year: 2024; Vol: 36 (3); Pages: 4560-4574