Book Chapter

Emotion Recognition Using Multimodal Fusion Models

Abstract

Service robotics will advance significantly in the coming years if robots can understand emotions. Humans themselves do not always recognize emotions reliably, and because robots cannot feel emotion or empathize, the task is considerably harder for them. Emotion recognition (ER) is typically performed with machine-learning and deep-learning methods. Facial emotion detection remains a difficult problem in affective computing because it is hard to extract discriminative features that capture the meaning of human emotions, which involve abstract concepts and many outward manifestations. Furthermore, how best to exploit both audio- and video-based information is still an open question. This work introduces the main ideas of emotion recognition, discusses the application of deep-learning techniques to ER, and evaluates and synthesizes emotion-recognition methodologies and models. Model accuracy depends on the number of emotions observed, the features extracted, the classification scheme, and the consistency of the database. A range of hypotheses from mainstream emotion science about how emotions are identified is also covered, motivating research into better physiological indicators of emotional state and the open problems that remain. This review focuses on deep-learning multimodal approaches, surveying them according to their inputs, such as images, text, and facial and body physiology.
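
As an illustration of the fusion idea discussed above (a minimal sketch, not taken from the chapter), the snippet below shows late fusion of audio and video features in PyTorch: each modality is encoded separately and the embeddings are concatenated before a shared classifier head. The feature dimensions, layer sizes, and seven-class emotion set are illustrative assumptions rather than values from the reviewed works.

import torch
import torch.nn as nn

class LateFusionEmotionNet(nn.Module):
    """Late-fusion audio-visual emotion classifier (illustrative sketch only)."""
    def __init__(self, audio_dim=40, video_dim=512, hidden=128, num_emotions=7):
        super().__init__()
        # Per-modality encoders map feature vectors to a shared hidden size.
        self.audio_encoder = nn.Sequential(nn.Linear(audio_dim, hidden), nn.ReLU())
        self.video_encoder = nn.Sequential(nn.Linear(video_dim, hidden), nn.ReLU())
        # Fusion head: concatenate both embeddings, then classify.
        self.classifier = nn.Sequential(
            nn.Linear(2 * hidden, hidden),
            nn.ReLU(),
            nn.Linear(hidden, num_emotions),
        )

    def forward(self, audio_feats, video_feats):
        a = self.audio_encoder(audio_feats)   # (batch, hidden)
        v = self.video_encoder(video_feats)   # (batch, hidden)
        fused = torch.cat([a, v], dim=-1)     # late fusion by concatenation
        return self.classifier(fused)         # (batch, num_emotions) logits

if __name__ == "__main__":
    model = LateFusionEmotionNet()
    audio = torch.randn(8, 40)     # e.g. utterance-level acoustic features
    video = torch.randn(8, 512)    # e.g. pooled facial-expression embeddings
    print(model(audio, video).shape)   # torch.Size([8, 7])

The same pattern extends to additional modalities (text, physiological signals) by adding an encoder per input and widening the fusion layer; early fusion would instead concatenate raw features before a single shared encoder.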

Keywords:
Fusion, Computer science, Emotion recognition, Artificial intelligence, Speech recognition, Linguistics, Philosophy

Topics

Emotion and Mood Recognition