JOURNAL ARTICLE

An Ensemble Transformer Model For Speech Emotion Recognition

Abstract

Audio is an indispensable attribute of nature, and the study of sound can expand our knowledge of nature and its peculiarities. One characteristic intrinsically associated with audio is the emotion or sentiment that it conveys. Emotions play a deep role in understanding the human psyche and mindset in a behavioural context, and their study advances domains such as sociology and psychology. With this aim, in this paper we explore speech emotion recognition and undertake a comparative study of popular existing deep learning algorithms, namely CNN and LSTM, with modifications of our own to the architecture and the hyperparameters used. Taking the resulting performance metrics into account, we propose a transformer model incorporating a bidirectional LSTM, an encoder, a decoder and scaled dot-product attention. The RAVDESS dataset, a popular standard, is used for this purpose. The model has shown promising preliminary results under different testing and validation metrics and can potentially be employed in systems that require high precision.
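
The abstract names scaled dot-product attention as one component of the proposed model but gives no implementation details. As a rough illustration only, the standard formulation softmax(Q K^T / sqrt(d_k)) V can be sketched in plain Python as below. This is not the authors' code: the function names, the toy input, and the choice of self-attention over three 2-dimensional "frames" are all assumptions made for the example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Return (outputs, weights) for softmax(Q K^T / sqrt(d_k)) V.

    queries: list of n_q vectors of length d_k
    keys:    list of n_k vectors of length d_k
    values:  list of n_k vectors of length d_v
    """
    d_k = len(keys[0])
    scale = math.sqrt(d_k)
    d_v = len(values[0])
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        w = softmax(scores)
        # Weighted average of the value vectors.
        out = [sum(wj * v[c] for wj, v in zip(w, values)) for c in range(d_v)]
        weights.append(w)
        outputs.append(out)
    return outputs, weights


# Hypothetical toy input: 3 frames with 2-dimensional features,
# used as queries, keys and values at once (self-attention).
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = scaled_dot_product_attention(frames, frames, frames)
```

Each row of `weights` sums to 1 and says how strongly one frame attends to every other frame; `outputs` holds the corresponding weighted averages of the value vectors. In a real speech model the inputs would be learned projections of acoustic features rather than raw toy vectors.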

Keywords:
Computer science, Transformer, Artificial intelligence, Deep learning, Speech recognition, Machine learning

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 24
Citation Normalized Percentile: 0.21

Topics

Emotion and Mood Recognition
Social Sciences →  Psychology →  Experimental and Cognitive Psychology
Music and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing
Speech and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing

Related Documents

JOURNAL ARTICLE

Ensemble softmax regression model for speech emotion recognition

Yaxin Sun, Guihua Wen

Journal: Multimedia Tools and Applications, Year: 2016, Vol: 76 (6), Pages: 8305-8328
JOURNAL ARTICLE

Multi-Emotion Recognition Model with Text and Speech Ensemble

Moung Ho Yi, Myung Jin Lim, Ju Hyun Shin

Journal: Korean Institute of Smart Media, Year: 2022, Vol: 11 (8), Pages: 65-72
JOURNAL ARTICLE

An Ensemble Model for Multi-Level Speech Emotion Recognition

Chunjun Zheng, Chunli Wang, Ning Jia

Journal: Applied Sciences, Year: 2019, Vol: 10 (1), Pages: 205-205