JOURNAL ARTICLE

Speech Emotion Recognition Using Convolutional Neural Networks with Attention Mechanism

Konstantinos C. Mountzouris, Isidoros Perikos, Ioannis Hatzilygeroudis

Year: 2023   Journal: Electronics   Vol: 12 (20)   Pages: 4376-4376   Publisher: Multidisciplinary Digital Publishing Institute

Abstract

Speech emotion recognition (SER) is an interesting and difficult problem. In this paper, we address it with deep learning networks. We designed and implemented six networks: a deep belief network (DBN), a simple deep neural network (SDNN), an LSTM network (LSTM), an LSTM network with an attention mechanism (LSTM-ATN), a convolutional neural network (CNN), and a convolutional neural network with an attention mechanism (CNN-ATN). Beyond solving the SER problem, our aim was to test the impact of the attention mechanism on the results. Dropout and batch normalization are also used to improve the generalization ability of the models (prevention of overfitting) and to speed up the training process. The Surrey Audio–Visual Expressed Emotion (SAVEE) database and the Ryerson Audio–Visual Database (RAVDESS) were used for training and evaluating our models. The results showed that the networks with an attention mechanism outperformed the others. Furthermore, the CNN-ATN was the best among the tested networks, achieving an accuracy of 74% on SAVEE and 77% on RAVDESS, and exceeding existing state-of-the-art systems for the same datasets.
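
The abstract does not say which form of attention the LSTM-ATN and CNN-ATN models use, so the following is only a generic illustration of the idea, not the authors' architecture. It sketches soft attention pooling: each time frame of a feature sequence is scored against a query vector, the scores are normalized with a softmax, and the frames are averaged using those weights. The names `attention_pool`, `frames` and `query` are hypothetical, and the query stands in for a parameter that would normally be learned.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(frames, query):
    """Soft attention pooling over a sequence of feature vectors.

    frames: list of T feature vectors, each of length D
            (for example, per-frame outputs of a CNN or LSTM).
    query:  vector of length D; hypothetical stand-in for a learned parameter.
    Returns (pooled, weights): the weighted average of the frames and the
    attention weight assigned to each frame.
    """
    # Score each frame by its dot product with the query.
    scores = [sum(f * q for f, q in zip(frame, query)) for frame in frames]
    weights = softmax(scores)
    dim = len(frames[0])
    # Weighted sum of the frames, one output value per feature dimension.
    pooled = [sum(w * frame[d] for w, frame in zip(weights, frames))
              for d in range(dim)]
    return pooled, weights


# Toy input (illustrative values only): three frames with two features each.
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [0.5, 0.5]
pooled, weights = attention_pool(frames, query)
```

In this toy input the third frame has the largest dot product with the query, so it receives the largest weight. In a trained network the pooled vector would typically feed a classification layer over the emotion labels.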

Keywords:
Computer science; Overfitting; Convolutional neural network; Deep learning; Dropout (neural networks); Artificial intelligence; Mechanism (biology); Artificial neural network; Normalization (sociology); Machine learning; Speech recognition

Metrics

Cited By: 14
FWCI (Field Weighted Citation Impact): 5.83
Refs: 38
Citation Normalized Percentile: 0.94

Topics

Emotion and Mood Recognition
Social Sciences →  Psychology →  Experimental and Cognitive Psychology
Music and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing
Speech and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing

Related Documents

JOURNAL ARTICLE

Speech Emotion Recognition using Convolutional Neural Networks with Attention Mechanisms

A. Poongodai, Y. V. Nandini, T Mounika, A J Karishma, N. K. Senthil Kumar

Journal: International Research Journal of Innovations in Engineering and Technology   Year: 2025   Vol: 09 (Special Issue ICCIS)   Pages: 162-167

JOURNAL ARTICLE

Speech Emotion Recognition using Convolutional Neural Networks and Recurrent Neural Networks with Attention Model

Journal: Proceedings of 2019 the 9th International Workshop on Computer Science and Engineering   Year: 2019

JOURNAL ARTICLE

Speech Emotion Recognition Using Convolutional Neural Networks

Narsi Reddy

Journal: International Journal for Research in Applied Science and Engineering Technology   Year: 2024   Vol: 12 (8)   Pages: 30-36

BOOK-CHAPTER

Speech Emotion Recognition Using Convolutional Neural Networks

Anunya Sharma, Kiran Malik, Poonam Bansal

Communications in Computer and Information Science   Year: 2024   Pages: 90-101