JOURNAL ARTICLE

Action Coherence Network for Weakly Supervised Temporal Action Localization

Abstract

Most prominent temporal action localization methods are of the fully-supervised type, which rely heavily on frame-level labels, which could be prohibitively expensive to annotate. Thanks to recent developments on the Weakly-supervised Temporal Action Localization (W-TAL), this alternative paradigm requires only video-level labels in training, alleviating such annotation efforts. Specifically, we present Action Coherence Network (ACN) for W-TAL, which features a new coherence loss that better supervises action boundary learning and facilitate proposal regression. In addition, a purpose-built fusion module is proposed for localization inference based on features extracted by two streams of convolutional neural network. Overall, the proposed ACN achieves state-of-the-art W-TAL performance on two challenging datasets (THU-MOS14 and ActivityNet1.2, particularly ACN attains mAP of 24.2% on THUMOS14 under IoU threshold 0.5), which is approaching some recent fully-supervised TAL methods.

Keywords:
Computer science Coherence (philosophical gambling strategy) Artificial intelligence Convolutional neural network Inference Annotation Supervised learning Action recognition Pattern recognition (psychology) Action (physics) Boundary (topology) Deep learning Machine learning Frame (networking) Artificial neural network Mathematics Class (philosophy)

Metrics

17
Cited By
1.50
FWCI (Field Weighted Citation Impact)
33
Refs
0.86
Citation Normalized Percentile
Is in top 1%
Is in top 10%

Citation History

Topics

Human Pose and Action Recognition
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Anomaly Detection Techniques and Applications
Physical Sciences →  Computer Science →  Artificial Intelligence
Multimodal Machine Learning Applications
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
© 2026 ScienceGate Book Chapters — All rights reserved.