Audio is an indispensable attribute of nature, and the study of sound is therefore a worthwhile avenue for expanding our knowledge of it and its peculiarities. One characteristic intrinsically associated with audio is the emotion, or sentiment, that it conveys. Emotions are central to understanding the human psyche and mindset in a behavioural context, and their study advances domains such as sociology and psychology. With this vision of advancement, in this paper we explore the realm of speech emotion recognition: we conduct a comparative study of popular existing deep learning algorithms, such as CNNs and LSTMs, alongside our own variations in architecture and hyperparameters, and, informed by the resulting performance metrics, propose a transformer model incorporating a bidirectional LSTM, an encoder, a decoder, and scaled dot-product attention. The widely used RAVDESS dataset serves as the benchmark. The model has shown promising preliminary results across different testing and validation metrics and can potentially be employed in systems requiring high precision.
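As context for the attention mechanism the abstract names, the following is a minimal sketch of standard scaled dot-product attention (as introduced in "Attention Is All You Need"), not the paper's own implementation; all shapes, names, and values here are illustrative assumptions.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Q: (n_q, d_k), K: (n_k, d_k), V: (n_k, d_v) -> output of shape (n_q, d_v)."""
    d_k = Q.shape[-1]
    # Similarity of each query to each key, scaled by sqrt(d_k) to keep
    # the softmax in a well-conditioned range.
    scores = Q @ K.T / np.sqrt(d_k)
    # Numerically stable row-wise softmax over the key dimension.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each output row is a convex combination of the value vectors.
    return weights @ V

# Toy example with hypothetical dimensions (2 queries, 3 keys/values).
rng = np.random.default_rng(0)
Q = rng.standard_normal((2, 4))
K = rng.standard_normal((3, 4))
V = rng.standard_normal((3, 5))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (2, 5)
```

In the proposed architecture this operation sits inside the encoder–decoder transformer, letting each time step of the speech representation weight every other time step when predicting the emotion label.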