JOURNAL ARTICLE

Context Driven Network with Bayes for Weakly Supervised Temporal Action Localization

Abstract

Weakly supervised temporal action localization (WTAL) aims to detect action instances in untrimmed videos. Because only video-level class labels are available for training, it suffers from two problems: action incompleteness and background disturbance. In this paper, we propose a context driven network with Bayes to alleviate both problems. An attention module first predicts a coarse probability for each snippet, and a Bayesian refinement module then refines the coarse results by capturing the relationships between context snippets. Finally, the coarse and refined probabilities are combined as the inputs of the classifier for training. Quantitative and qualitative studies on two benchmark datasets, THUMOS'14 and ActivityNet 1.2, demonstrate that the proposed approach outperforms state-of-the-art methods.
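The coarse-then-refine idea in the abstract can be illustrated with a minimal sketch. It is not the authors' network. It assumes, purely for illustration, that the coarse attention score of a snippet acts as a prior, that the mean coarse score of its neighbouring snippets acts as context evidence, and that the two are fused with a two-class Bayes update. The function names `refine_with_context` and `combine`, the window `radius`, and the mixing `weight` are all hypothetical.

```python
def refine_with_context(coarse, radius=1, eps=1e-6):
    """Refine per-snippet foreground probabilities using neighbouring snippets.

    Assumption (not from the paper): the coarse score is the prior and the
    mean coarse score of the neighbours is the context evidence.
    """
    n = len(coarse)
    refined = []
    for t, prior in enumerate(coarse):
        # Neighbouring snippets within `radius`, excluding snippet t itself.
        lo, hi = max(0, t - radius), min(n, t + radius + 1)
        neighbours = [coarse[j] for j in range(lo, hi) if j != t]
        if not neighbours:
            # A single-snippet video has no context to refine with.
            refined.append(prior)
            continue
        context = sum(neighbours) / len(neighbours)
        # Clamp so the denominator below can never be zero.
        context = min(max(context, eps), 1.0 - eps)
        # Two-class Bayes update: posterior is proportional to prior * evidence.
        numerator = prior * context
        denominator = numerator + (1.0 - prior) * (1.0 - context)
        refined.append(numerator / denominator)
    return refined


def combine(coarse, refined, weight=0.5):
    """Blend coarse and refined scores (a hypothetical fusion rule)."""
    return [weight * c + (1.0 - weight) * r for c, r in zip(coarse, refined)]


if __name__ == "__main__":
    coarse = [0.9, 0.4, 0.9, 0.1, 0.1]
    refined = refine_with_context(coarse)
    print(combine(coarse, refined))
```

In this toy input, the weak snippet sitting between two confident action snippets is pulled upward, which relates to action incompleteness, while the low-scoring snippets at the end are pushed further down, which relates to background disturbance.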

Keywords:
Snippet; Computer science; Artificial intelligence; Machine learning; Classifier (UML); Benchmark (surveying); Bayesian probability; Context (archaeology); Bayes' theorem; Naive Bayes classifier; Class (philosophy); Action (physics); Bayesian network; Pattern recognition (psychology); Support vector machine; Information retrieval

Metrics

Cited By: 1
FWCI (Field Weighted Citation Impact): 0.10
Refs: 22
Citation Normalized Percentile: 0.37

Topics

Human Pose and Action Recognition
Physical Sciences → Computer Science → Computer Vision and Pattern Recognition
Anomaly Detection Techniques and Applications
Physical Sciences → Computer Science → Artificial Intelligence
Multimodal Machine Learning Applications
Physical Sciences → Computer Science → Computer Vision and Pattern Recognition