JOURNAL ARTICLE

Contrastive Learning with Cross-Modal Knowledge Mining for Multimodal Human Activity Recognition

Razvan BrinzeaBulat KhaertdinovStylianos Asteriadis

Year: 2022 Journal:   2022 International Joint Conference on Neural Networks (IJCNN) Pages: 01-08

Abstract

Human Activity Recognition is a field of research where input data can take many forms. Each of the possible input modalities describes human behaviour in a different way, and each has its own strengths and weaknesses. We explore the hypothesis that leveraging multiple modalities can lead to better recognition. Since manual annotation of input data is expensive and time-consuming, the emphasis is made on self-supervised methods which can learn useful feature representations without any ground truth labels. We extend a number of recent contrastive self-supervised approaches for the task of Human Activity Recognition, leveraging inertial and skeleton data. Furthermore, we propose a flexible, general-purpose framework for performing multimodal self-supervised learning, named Contrastive Multiview Coding with Cross-Modal Knowledge Mining (CMC-CMKM). This framework exploits modality-specific knowledge in order to mitigate the limitations of typical self-supervised frameworks. The extensive experiments on two widely-used datasets demonstrate that the suggested framework significantly outperforms contrastive unimodal and multimodal baselines on different scenarios, including fully-supervised fine-tuning, activity retrieval and semi-supervised learning. Furthermore, it shows performance competitive even compared to supervised methods.

Keywords:
Computer science Artificial intelligence Modalities Machine learning Exploit Modality (human–computer interaction) Supervised learning Modal Ground truth Natural language processing Pattern recognition (psychology) Artificial neural network

Metrics

20
Cited By
1.38
FWCI (Field Weighted Citation Impact)
52
Refs
0.86
Citation Normalized Percentile
Is in top 1%
Is in top 10%

Citation History

Topics

Human Pose and Action Recognition
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Multimodal Machine Learning Applications
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Context-Aware Activity Recognition Systems
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
© 2026 ScienceGate Book Chapters — All rights reserved.