JOURNAL ARTICLE

Multi-Scale Masked Autoencoders for Cross-Session Emotion Recognition

Miaoqi Pang, Hongtao Wang, Jiayang Huang, Chi-Man Vong, Zhiqiang Zeng, Chuangquan Chen

Year: 2024 · Journal: IEEE Transactions on Neural Systems and Rehabilitation Engineering · Vol: 32 · Pages: 1637-1646 · Publisher: Institute of Electrical and Electronics Engineers

Abstract

Affective brain-computer interfaces (aBCIs) have found widespread application, with remarkable advancements in utilizing electroencephalogram (EEG) technology for emotion recognition. However, the time-consuming process of annotating EEG data, inherent individual differences, non-stationary characteristics of EEG data, and noise artifacts in EEG data collection pose formidable challenges in developing subject-specific cross-session emotion recognition models. To simultaneously address these challenges, we propose a unified pre-training framework based on multi-scale masked autoencoders (MSMAE), which utilizes large-scale unlabeled EEG signals from multiple subjects and sessions to extract noise-robust, subject-invariant, and temporal-invariant features. We subsequently fine-tune the obtained generalized features with only a small amount of labeled data from a specific subject for personalization and enable cross-session emotion recognition. Our framework emphasizes: 1) Multi-scale representation to capture diverse aspects of EEG signals, obtaining comprehensive information; 2) An improved masking mechanism for robust channel-level representation learning, addressing missing channel issues while preserving inter-channel relationships; and 3) Invariance learning for regional correlations in spatial-level representation, minimizing inter-subject and inter-session variances. With these designs in place, the proposed MSMAE exhibits a remarkable ability to decode emotional states from a different session of EEG data during the testing phase. Extensive experiments conducted on two publicly available datasets, SEED and SEED-IV, demonstrate that the proposed MSMAE consistently achieves stable results and outperforms competitive baseline methods in cross-session emotion recognition.
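To make the channel-level masking idea concrete, the following is a minimal sketch (not the authors' code) of how entire EEG channels can be masked before masked-autoencoder pre-training: whole channels are hidden so the model must reconstruct them from the remaining channels, which is what makes the learned representation robust to missing or noisy channels. The function name, masking ratio, and zero-fill strategy are illustrative assumptions, not details taken from the paper.

```python
# Hypothetical sketch of channel-level masking for EEG masked-autoencoder
# pre-training. Assumptions (not from the paper): masked channels are
# zeroed out, and the mask ratio and function names are illustrative.
import numpy as np

def mask_channels(eeg, mask_ratio=0.5, rng=None):
    """eeg: (channels, timesteps) array. Returns (masked copy, boolean mask)."""
    rng = rng or np.random.default_rng(0)
    n_ch = eeg.shape[0]
    n_masked = int(round(n_ch * mask_ratio))
    masked_idx = rng.choice(n_ch, size=n_masked, replace=False)
    mask = np.zeros(n_ch, dtype=bool)
    mask[masked_idx] = True
    eeg_masked = eeg.copy()
    eeg_masked[mask] = 0.0  # hide entire channels, not individual samples
    return eeg_masked, mask

# Example: a 62-channel EEG segment (SEED recordings use 62 channels)
eeg = np.random.randn(62, 200)
eeg_masked, mask = mask_channels(eeg, mask_ratio=0.5)
# In MAE-style training, the reconstruction loss would be computed only
# on the masked channels, e.g.:
#   loss = mse(decoder(encoder(eeg_masked))[mask], eeg[mask])
```

Because the mask operates on whole channels rather than scattered time samples, the decoder can only succeed by exploiting inter-channel relationships, which is the property the abstract attributes to the improved masking mechanism.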

Keywords:
Computer science; Electroencephalography; Artificial intelligence; Pattern recognition; Speech recognition; Machine learning; Psychology

Metrics

Cited By: 21
FWCI (Field-Weighted Citation Impact): 14.76
References: 43
Citation Normalized Percentile: 0.98 (in top 1% and top 10%)


Topics

EEG and Brain-Computer Interfaces
Life Sciences → Neuroscience → Cognitive Neuroscience
Emotion and Mood Recognition
Social Sciences → Psychology → Experimental and Cognitive Psychology
ECG Monitoring and Analysis
Health Sciences → Medicine → Cardiology and Cardiovascular Medicine

Related Documents

JOURNAL ARTICLE

Emotion Recognition Using Multi-Scale Auto-Encoders with Cross Session Adoption

G. ChennaKesava Reddy et al.

Journal: Zenodo (CERN European Organization for Nuclear Research) · Year: 2025
BOOK-CHAPTER

MultiMAE: Multi-modal Multi-task Masked Autoencoders

Roman Bachmann, David Mizrahi, Andrei Atanov, Amir Zamir

Lecture Notes in Computer Science · Year: 2022 · Pages: 348-367
JOURNAL ARTICLE

Multi-Scale Hyperbolic Contrastive Learning for Cross-Subject EEG Emotion Recognition

Chang Jiang, Zhixin Zhang, Yuhua Qian, Pan Lin

Journal: IEEE Transactions on Affective Computing · Year: 2025 · Vol: 16 (3) · Pages: 1716-1731