Foreground-Action Consistency Network for Weakly Supervised Temporal Action Localization

Linjiang Huang; Liang Wang; Hongsheng Li

doi:10.1109/iccv48922.2021.00790

JOURNAL ARTICLE

Year: 2021

Journal: 2021 IEEE/CVF International Conference on Computer Vision (ICCV)

Pages: 7982-7991

Abstract

As a challenging task in high-level video understanding, weakly supervised temporal action localization has been attracting increasing attention. With only video-level annotations, most existing methods handle this task with a localization-by-classification framework, which generally adopts a selector to pick the snippets with a high probability of containing an action, namely the foreground. Nevertheless, the existing foreground selection strategies share a major limitation: they consider only the unilateral relation from foreground to actions, which cannot guarantee foreground-action consistency. In this paper, we present a framework named FAC-Net, built on the I3D backbone, to which three branches are appended: a class-wise foreground classification branch, a class-agnostic attention branch, and a multiple instance learning branch. First, the class-wise foreground classification branch regularizes the relation between actions and foreground to maximize the foreground-background separation. In addition, the class-agnostic attention branch and the multiple instance learning branch are adopted to regularize the foreground-action consistency and help to learn a meaningful foreground classifier. Within each branch, we introduce a hybrid attention mechanism, which calculates multiple attention scores for each snippet, so that the model focuses on both discriminative and less-discriminative snippets and captures the full action boundaries. Experimental results on THUMOS14 and ActivityNet1.3 demonstrate the state-of-the-art performance of our method.
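The localization-by-classification idea mentioned in the abstract can be illustrated with a minimal sketch. It is not FAC-Net itself: the abstract does not give the branch architectures, losses, or the hybrid attention formula, so the code below only shows two generic ways of pooling snippet-level class scores into a video-level prediction (a class-agnostic attention weighting and a top-k multiple instance learning pooling), followed by simple thresholding to obtain segments. The function names, the value of `k`, and the threshold are all assumptions made for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(snippet_scores, attention_logits):
    """Class-agnostic attention (illustrative): one foreground weight per
    snippet, shared by all classes, used to average the snippet scores."""
    weights = softmax(attention_logits)  # normalise over the time axis
    num_classes = len(snippet_scores[0])
    return [
        sum(w * s[c] for w, s in zip(weights, snippet_scores))
        for c in range(num_classes)
    ]


def topk_mil_pool(snippet_scores, k):
    """MIL-style pooling (illustrative): for each class, average the k
    highest snippet scores to get a video-level score."""
    num_classes = len(snippet_scores[0])
    pooled = []
    for c in range(num_classes):
        top = sorted((s[c] for s in snippet_scores), reverse=True)[:k]
        pooled.append(sum(top) / len(top))
    return pooled


def localize(snippet_scores, cls, threshold):
    """Threshold one class's snippet scores and merge consecutive hits
    into (start, end) snippet index pairs, with end exclusive."""
    segments, start = [], None
    for t, s in enumerate(snippet_scores):
        if s[cls] >= threshold and start is None:
            start = t
        elif s[cls] < threshold and start is not None:
            segments.append((start, t))
            start = None
    if start is not None:
        segments.append((start, len(snippet_scores)))
    return segments


if __name__ == "__main__":
    # Hypothetical scores for 6 snippets and 2 action classes.
    scores = [[0.1, 0.0], [0.2, 0.1], [2.0, 0.1],
              [2.5, 0.0], [0.3, 0.2], [1.8, 0.1]]
    # Hypothetical per-snippet foreground logits from an attention branch.
    logits = [-2.0, -2.0, 3.0, 3.0, -2.0, 3.0]
    print(attention_pool(scores, logits))
    print(topk_mil_pool(scores, k=2))
    print(localize(scores, cls=0, threshold=1.0))
```

In this sketch only video-level labels would be needed for training, because both pooling functions reduce the snippet scores to one score per class; the snippet-level scores are then reused at test time to place segment boundaries.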

Keywords:

Discriminative model; Computer science; Artificial intelligence; Classifier (UML); Consistency (knowledge bases); Snippet; Class (philosophy); Margin (machine learning); Machine learning; Action recognition; Relation (database); Task (project management); Pattern recognition (psychology); Data mining; Information retrieval

Metrics

FWCI (Field Weighted Citation Impact): 5.00

Citation Normalized Percentile: 0.97

Topics

Human Pose and Action Recognition

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Anomaly Detection Techniques and Applications

Physical Sciences → Computer Science → Artificial Intelligence

Multimodal Machine Learning Applications

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Related Documents

Collaborative Foreground, Background, and Action Modeling Network for Weakly Supervised Temporal Action Localization

Action Coherence Network for Weakly Supervised Temporal Action Localization

Action Coherence Network for Weakly-Supervised Temporal Action Localization

Weakly-Supervised Temporal Action Localization with Regional Similarity Consistency

Action-to-Action Diffusion Network for Weakly Supervised Temporal Action Localization