Yuzhe ZhangHuan LiuYang XiaoMohammed AmoonDalin ZhangDi WangShusen YangChai Quek
The critical importance of monitoring and recognizing human emotional states in healthcare has led to a surge in proposals for EEG-based multimodal emotion recognition in recent years. However, practical challenges arise in acquiring EEG signals in daily healthcare settings due to stringent data acquisition conditions, resulting in the issue of incomplete modalities. Existing studies have turned to knowledge distillation as a means to mitigate this problem by transferring knowledge from multimodal networks to unimodal ones. However, these methods are constrained by the use of a single teacher model to transfer integrated feature extraction knowledge, particularly concerning spatial and temporal features in EEG data. To address this limitation, we propose a multi-teacher knowledge distillation framework enhanced with a Large Language Model (LLM), aimed at facilitating effective feature learning in the student network by transferring knowledge of extracting integrated features. Specifically, we employ an LLM as the teacher for extracting temporal features and a graph convolutional neural network for extracting spatial features. To further enhance knowledge distillation, we introduce causal masking and a confidence indicator into the LLM to facilitate the transfer of the most discriminative features. Extensive testing on the DEAP and MAHNOB-HCI datasets demonstrates that our model outperforms existing methods in the modality-incomplete scenario. This study underscores the potential application of large models in this field.
Seonggyu LeeYoungdo AhnJong Won Shin
Mehedi Hasan BijoyDejan PorjazovskiTamás GrószMikko Kurimo
Mehmet Ali SarikayaGökhan İnce
Lipeng GanRunze CaoNing LiMan YangXiaochao Li
Ziyu JiaYucheng LiuHaichao WangTianzi Jiang