JOURNAL ARTICLE

Few-shot Action Recognition with Video Transformer

Abstract

This paper proposes a novel few-shot action recognition framework that integrates a Transformer-based feature backbone into meta-learning. The proposed method pre-trains a Video Transformer and then applies metric-based meta-learning with the ProtoNet algorithm. Extensive experiments on benchmark datasets show that the approach surpasses baseline models and achieves results competitive with state-of-the-art methods. Additionally, we investigate the impact of supervised versus self-supervised learning on video representation and evaluate the transferability of the learned representations in cross-domain scenarios. Our approach suggests a promising direction for combining meta-learning with Video Transformers in few-shot learning tasks, potentially contributing to action recognition across various domains.
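The metric-based meta-learning step the abstract mentions can be illustrated with the core ProtoNet classification rule: average each class's support embeddings into a prototype, then assign each query to the nearest prototype. The following is a minimal NumPy sketch under stated assumptions; the embeddings are stand-ins for features a pre-trained Video Transformer backbone would produce, and all names and dimensions are illustrative, not taken from the paper.

```python
import numpy as np

def prototypical_classify(support, support_labels, queries, n_way):
    """ProtoNet-style few-shot classification.

    support: (n_way * k_shot, d) support-set embeddings, here assumed to
             come from a frozen video feature backbone.
    support_labels: (n_way * k_shot,) integer class labels in [0, n_way).
    queries: (n_query, d) query-set embeddings.
    Returns the predicted class index for each query.
    """
    # Class prototype = mean of that class's support embeddings.
    prototypes = np.stack([
        support[support_labels == c].mean(axis=0) for c in range(n_way)
    ])
    # Squared Euclidean distance from every query to every prototype;
    # the negative distance plays the role of the classification logit.
    dists = ((queries[:, None, :] - prototypes[None, :, :]) ** 2).sum(axis=-1)
    return (-dists).argmax(axis=1)

# Toy 3-way 2-shot episode with 8-dim "video features" drawn around
# well-separated class centers, so nearest-prototype should be exact.
rng = np.random.default_rng(0)
centers = rng.normal(size=(3, 8)) * 5.0
support_labels = np.repeat(np.arange(3), 2)
support = centers[support_labels] + rng.normal(scale=0.1, size=(6, 8))
query_labels = np.repeat(np.arange(3), 4)
queries = centers[query_labels] + rng.normal(scale=0.1, size=(12, 8))

preds = prototypical_classify(support, support_labels, queries, n_way=3)
print((preds == query_labels).mean())
```

In an actual episodic training loop, the backbone would be fine-tuned (or kept frozen, as in transfer settings) while episodes like the one above are sampled from the base classes; at meta-test time the same nearest-prototype rule is applied to novel classes with only a few labeled clips each.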

Keywords:
Computer science; Transformer; Action recognition; Artificial intelligence; Computer vision; Pattern recognition

Metrics

Cited By: 0
FWCI (Field-Weighted Citation Impact): 0.00
References: 39
Citation Normalized Percentile: 0.22

Topics

Human Pose and Action Recognition (Computer Science → Computer Vision and Pattern Recognition)
Anomaly Detection Techniques and Applications (Computer Science → Artificial Intelligence)
Video Surveillance and Tracking Methods (Computer Science → Computer Vision and Pattern Recognition)