Vision-language pre-training models learn visual concepts from image-text or video-text pairs, and these concepts can be transferred to visual-textual tasks. In this paper, we adopt such concepts as prior knowledge to mitigate the unreliability of minimizing the loss over limited training samples in few-shot action recognition. Specifically, we design a two-stage framework of vision-language pre-training and prompt tuning. In the pre-training stage, multi-modal encoders are jointly trained on video-text pairs to learn the semantic correspondence between video and text. In the prompt-tuning stage, a prompt module with an instance-level bias is trained on a few video samples to exploit the pre-trained concepts for classification. Experimental results show that the proposed method outperforms the baseline and state-of-the-art few-shot action recognition methods on two public video benchmarks.
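To make the prompt-tuning stage concrete, the following is a minimal PyTorch-style sketch of one way a prompt module with an instance-level bias could be wired on top of frozen pre-trained encoders. The names (`PromptTuner`, `bias_net`, the stub text encoder) and all dimensions are illustrative assumptions, not the authors' exact design.

```python
# Hypothetical sketch: shared learnable prompt context plus an instance-level
# bias predicted from each video's feature. Encoder internals are stubbed; a
# real system would reuse the pre-trained video/text encoders from stage one.
import torch
import torch.nn as nn
import torch.nn.functional as F


class PromptTuner(nn.Module):
    def __init__(self, embed_dim=512, ctx_len=8, num_classes=5):
        super().__init__()
        # Shared learnable prompt context tokens.
        self.ctx = nn.Parameter(torch.randn(ctx_len, embed_dim) * 0.02)
        # Lightweight network mapping a video feature to an instance-level
        # bias that is added to the prompt context (conditional prompting).
        self.bias_net = nn.Sequential(
            nn.Linear(embed_dim, embed_dim // 4),
            nn.ReLU(inplace=True),
            nn.Linear(embed_dim // 4, embed_dim),
        )
        # Placeholder class-name token embeddings (one per action class).
        self.cls_tokens = nn.Parameter(torch.randn(num_classes, embed_dim) * 0.02)

    def forward(self, video_feat, text_encoder):
        # video_feat: (B, D) pooled feature from the frozen video encoder.
        bias = self.bias_net(video_feat)                     # (B, D)
        ctx = self.ctx.unsqueeze(0) + bias.unsqueeze(1)      # (B, L, D)
        # Build one prompt per class for every instance: context + class token.
        prompts = torch.cat(
            [ctx.unsqueeze(1).expand(-1, self.cls_tokens.size(0), -1, -1),
             self.cls_tokens[None, :, None, :].expand(ctx.size(0), -1, -1, -1)],
            dim=2)                                           # (B, C, L+1, D)
        text_feat = text_encoder(prompts)                    # (B, C, D)
        logits = F.cosine_similarity(video_feat[:, None, :], text_feat, dim=-1)
        return logits * 100.0                                # temperature-scaled


# Toy usage with a stub text encoder that mean-pools the prompt tokens.
if __name__ == "__main__":
    tuner = PromptTuner()
    stub_text_encoder = lambda p: F.normalize(p.mean(dim=2), dim=-1)
    video_feat = F.normalize(torch.randn(4, 512), dim=-1)   # pretend encoder output
    logits = tuner(video_feat, stub_text_encoder)            # (4, 5)
    loss = F.cross_entropy(logits, torch.randint(0, 5, (4,)))
    loss.backward()
```

In a few-shot episode, only the prompt context, bias network, and class tokens would be updated on the support videos, while the pre-trained encoders stay frozen.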