With the development of artificial intelligence, emotion recognition has become a major topic in human-computer interaction. This paper focuses on the application and optimization of deep convolutional neural networks (CNNs) in multimodal emotion recognition. Multimodal emotion recognition analyzes data from different sources, such as voice, facial expressions, and text, to identify and interpret human emotional states more accurately. This paper first reviews the basic theories and methods of multimodal data processing, then details the structure and function of deep convolutional neural networks, particularly their advantages in handling various types of data. By innovating and optimizing network structures, loss functions, and training strategies, we improve the model's accuracy in emotion recognition. Experimental results show that the optimized CNN model achieves superior performance on multimodal emotion recognition tasks.
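The abstract describes a CNN that fuses several modalities for emotion classification. The paper's actual architecture is not given here, so the following is only a minimal late-fusion sketch under assumed inputs (MFCC-style audio frames and grayscale face crops) and an assumed set of seven emotion classes:

```python
# Minimal late-fusion multimodal CNN sketch (an illustrative assumption,
# not the paper's architecture): a 1-D conv branch for audio features and
# a 2-D conv branch for face images, concatenated for classification.
import torch
import torch.nn as nn

class MultimodalEmotionCNN(nn.Module):
    def __init__(self, num_emotions=7):
        super().__init__()
        # Audio branch: 1-D convolution over 40-dim MFCC frames (assumed input)
        self.audio = nn.Sequential(
            nn.Conv1d(40, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        # Face branch: 2-D convolution over 1-channel face crops (assumed input)
        self.face = nn.Sequential(
            nn.Conv2d(1, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        # Late fusion: concatenate pooled branch features, then classify
        self.classifier = nn.Linear(64 + 64, num_emotions)

    def forward(self, audio, face):
        a = self.audio(audio).flatten(1)   # (batch, 64)
        f = self.face(face).flatten(1)     # (batch, 64)
        return self.classifier(torch.cat([a, f], dim=1))

model = MultimodalEmotionCNN()
audio = torch.randn(2, 40, 100)   # batch of 2 MFCC sequences, 100 frames each
face = torch.randn(2, 1, 48, 48)  # batch of 2 grayscale 48x48 face crops
logits = model(audio, face)       # one score per emotion class
```

A text branch (e.g. a 1-D convolution over word embeddings) would be fused the same way; the design choice here is late fusion, i.e. each modality is encoded independently and only the pooled features are combined.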
Hao Tang, Wei Liu, Wei-Long Zheng, Bao-Liang Lu
Shubham Tamhane, Apeksha Shrirao, Manthan Shah, Deepali Patil
Ashutosh Tiwari, Satyam Kumar, Tushar Mehrotra, Rajneesh Kumar Singh