Active Exploration of Multimodal Complementarity for Few-Shot Action Recognition

Yuyang Wanyan; Xiaoshan Yang; Chaofan Chen; Changsheng Xu

doi:10.1109/cvpr52729.2023.00628

JOURNAL ARTICLE

Year: 2023 Pages: 6492-6502

Abstract

Recently, few-shot action recognition has received increasing attention and achieved remarkable progress. However, previous methods mainly rely on limited unimodal data (e.g., RGB frames), while multimodal information remains relatively underexplored. In this paper, we propose a novel Active Multimodal Few-shot Action Recognition (AMFAR) framework, which actively finds the reliable modality for each sample based on task-dependent context information to improve the few-shot reasoning procedure. In meta-training, we design an Active Sample Selection (ASS) module that organizes query samples with large differences in modality reliability into different groups based on modality-specific posterior distributions. In addition, we design an Active Mutual Distillation (AMD) module that captures discriminative task-specific knowledge from the reliable modality to improve the representation learning of the unreliable modality through bidirectional knowledge distillation. In meta-test, we adopt Adaptive Multimodal Inference (AMI) to adaptively fuse the modality-specific posterior distributions, placing a larger weight on the reliable modality. Extensive experimental results on four public benchmarks demonstrate that our model achieves significant improvements over existing unimodal and multimodal methods.
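The abstract describes fusing modality-specific posterior distributions with a larger weight on the more reliable modality, but it does not say how reliability is measured or how the weights are computed. The sketch below is only an illustration of that general idea under stated assumptions, not the AMFAR method: reliability is assumed to be one minus the normalized entropy of a posterior, the weights are assumed to be the normalized reliabilities, and the names `reliability`, `adaptive_fusion`, `post_rgb`, and `post_flow` are hypothetical (the second modality is called "flow" purely for illustration).

```python
import math


def softmax(scores):
    """Turn raw class scores into a posterior distribution."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy (natural log) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def reliability(probs):
    """ASSUMPTION: reliability = 1 - normalized entropy, clamped to [0, 1].

    A peaked posterior scores near 1, a near-uniform one near 0.
    """
    n = len(probs)
    if n < 2:
        return 1.0
    score = 1.0 - entropy(probs) / math.log(n)
    return min(1.0, max(0.0, score))


def adaptive_fusion(post_a, post_b):
    """Fuse two posteriors, weighting each by its assumed reliability."""
    r_a = reliability(post_a)
    r_b = reliability(post_b)
    total = r_a + r_b
    # Fall back to equal weights if both posteriors are uniform.
    w_a = 0.5 if total == 0.0 else r_a / total
    w_b = 1.0 - w_a
    fused = [w_a * a + w_b * b for a, b in zip(post_a, post_b)]
    return fused, (w_a, w_b)


# Hypothetical 5-way task: one modality is confident, the other is not.
post_rgb = softmax([4.0, 1.0, 0.5, 0.2, 0.1])
post_flow = softmax([1.0, 1.1, 0.9, 1.0, 1.0])
fused, weights = adaptive_fusion(post_rgb, post_flow)
predicted_class = fused.index(max(fused))
```

In this toy setting the confident posterior receives most of the weight, so the fused prediction follows it. A convex combination of two distributions is itself a distribution, so the fused scores still sum to one.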

Keywords:

Computer science; Artificial intelligence; Modality (human–computer interaction); Discriminative model; Machine learning; Inference; Modalities; Reinforcement learning; Context (archaeology)

Metrics

FWCI (Field Weighted Citation Impact): 7.82

Citation Normalized Percentile: 0.97

Topics

Human Pose and Action Recognition

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Anomaly Detection Techniques and Applications

Physical Sciences → Computer Science → Artificial Intelligence

Multimodal Machine Learning Applications

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Related Documents

Active Multimodal Distillation for Few-shot Action Recognition

Multimodal Prototype-Enhanced Network for Few-Shot Action Recognition

Multimodal Cross-Domain Few-Shot Learning for Egocentric Action Recognition

Few-shot Egocentric Multimodal Activity Recognition