JOURNAL ARTICLE

CoLA: Weakly-Supervised Temporal Action Localization with Snippet Contrastive Learning

Abstract

Weakly-supervised temporal action localization (WS-TAL) aims to localize actions in untrimmed videos with only video-level labels. Most existing models follow the "localization by classification" procedure: locate temporal regions contributing most to the video-level classification. Generally, they process each snippet (or frame) individually and thus overlook the fruitful temporal context relation. Here arises the single snippet cheating issue: "hard" snippets are too vague to be classified. In this paper, we argue that learning by comparing helps identify these hard snippets and we propose to utilize snippet Contrastive learning to Localize Actions, CoLA for short. Specifically, we propose a Snippet Contrast (SniCo) Loss to refine the hard snippet representation in feature space, which guides the network to perceive precise temporal boundaries and avoid the temporal interval interruption. Besides, since it is infeasible to access frame-level annotations, we introduce a Hard Snippet Mining algorithm to locate the potential hard snippets. Substantial analyses verify that this mining strategy efficaciously captures the hard snippets and SniCo Loss leads to more informative feature representation. Extensive experiments show that CoLA achieves state-of-the-art results on THUMOS'14 and ActivityNet v1.2 datasets.
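
The "learning by comparing" idea in the abstract can be illustrated with a generic InfoNCE-style contrastive loss. The abstract does not give the SniCo formulation, so the sketch below is an assumption rather than the authors' method: the function names, the cosine similarity, the temperature value, and the toy 3-dimensional features are all hypothetical. It only shows how a loss of this kind pulls a hard snippet feature toward a confident reference and away from contrasting ones.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-8)

def snippet_contrast_loss(query, positive, negatives, temperature=0.07):
    """Generic InfoNCE-style loss (an assumption, not the paper's exact SniCo loss).

    The loss is small when `query` is more similar to `positive`
    than to any vector in `negatives`.
    """
    pos_logit = cosine(query, positive) / temperature
    logits = [pos_logit] + [cosine(query, n) / temperature for n in negatives]
    # Log-sum-exp with max subtraction for numerical stability.
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_denominator - pos_logit

# Hypothetical toy features: a hard action snippet, an easy action snippet,
# and two easy background snippets.
hard_action = [0.9, 0.1, 0.0]
easy_action = [1.0, 0.0, 0.0]
easy_background = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# Pairing the hard action snippet with the easy action snippet gives a low loss.
loss_aligned = snippet_contrast_loss(hard_action, easy_action, easy_background)
# Pairing it with a background snippet instead gives a much higher loss.
loss_misaligned = snippet_contrast_loss(hard_action, easy_background[0], [easy_action])
```

In a training setup, minimizing such a loss would move the hard snippet's representation toward the easy action features. How the hard and easy snippets are selected is the job of the Hard Snippet Mining step, which this sketch does not attempt to reproduce.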

Keywords:
Snippet, Cola (plant), Computer science, Artificial intelligence, Action (physics), Natural language processing, Information retrieval

Metrics

Cited By: 153
FWCI (Field Weighted Citation Impact): 13.19
Refs: 72
Citation Normalized Percentile: 0.99
Is in top 1%
Is in top 10%

Topics

Human Pose and Action Recognition
Physical Sciences → Computer Science → Computer Vision and Pattern Recognition
Anomaly Detection Techniques and Applications
Physical Sciences → Computer Science → Artificial Intelligence
Gait Recognition and Analysis
Physical Sciences → Engineering → Biomedical Engineering

Related Documents

JOURNAL ARTICLE

Snippet-to-Prototype Contrastive Consensus Network for Weakly Supervised Temporal Action Localization

Yuxiang Shao, Feifei Zhang, Changsheng Xu

Journal: IEEE Transactions on Multimedia Year: 2024 Vol: 26 Pages: 6717-6729

BOOK-CHAPTER

Weakly Supervised Temporal Action Localization Through Segment Contrastive Learning

Zihao Jiang, Yidong Li

Communications in computer and information science Year: 2023 Pages: 228-243

JOURNAL ARTICLE

Weakly Supervised Temporal Action Localization With Contrastive Learning-Based Action Salience Network

Jingtao Sun, Shi Weipeng, Hao Shaoyang, Fusheng Li

Journal: The European Journal on Artificial Intelligence Year: 2025 Vol: 38 (1) Pages: 64-78

JOURNAL ARTICLE

Fine-grained Temporal Contrastive Learning for Weakly-supervised Temporal Action Localization

Junyu Gao, Mengyuan Chen, Changsheng Xu

Journal: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Year: 2022 Pages: 19967-19977