Speech Emotion Recognition Using LSTM

Sarika Gaind; Shubham Budhiraja; Deepak Gauba; Manpreet Kaur

doi:10.1201/9781003332312-25

Year: 2023; Apple Academic Press eBooks; Pages: 317-325

Abstract

Effective speech emotion recognition is critical for improving human–machine interaction, and the field calls for new methods that can support next-generation virtual chat assistants. This paper proposes a deep learning model that analyzes a speaker's emotional state from speech, with the aim of improving virtual personal assistants such as Siri and Alexa. Voice samples are analyzed using features such as amplitude, frequency, and Mel-Frequency Cepstral Coefficients (MFCCs). Speech signals are extracted from two datasets, SAVEE and CREMA-D, covering six emotions: happy, sad, fear, anger, disgust, and neutral. A Recurrent Neural Network classifier is used to gain better insight into emotion analysis. The working of an LSTM model is reviewed, and an accuracy of 92.3% is achieved with binary cross-entropy as the loss function. The network was optimized with Adam, an efficient extension of stochastic gradient descent that has recently seen broader adoption for deep learning applications in computer vision and natural language processing. Adam computes an adaptive learning rate for each parameter; its developers show that it works well in practice and compares favorably against other adaptive learning algorithms. They also propose default values for the optimizer parameters: Beta1 = 0.9, Beta2 = 0.999, and Epsilon = 10^-8.
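As a rough illustration of the Adam update rule mentioned in the abstract, the following is a minimal pure-Python sketch, not the chapter's implementation. It uses the default Beta1, Beta2, and Epsilon values quoted above. The function name adam_minimize, the learning rate, the step count, and the toy quadratic objective are all assumptions introduced here for illustration; the chapter itself trains an LSTM with a binary cross-entropy loss.

```python
import math


def adam_minimize(grad, x0, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8, steps=1000):
    """Minimize a scalar function from its gradient using the Adam update rule.

    beta1, beta2 and eps default to the values quoted in the abstract;
    lr and steps are illustrative assumptions.
    """
    x = x0
    m = 0.0  # running mean of the gradient (first moment)
    v = 0.0  # running mean of the squared gradient (second moment)
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1.0 - beta1) * g
        v = beta2 * v + (1.0 - beta2) * g * g
        # Bias correction compensates for m and v being initialized at zero.
        m_hat = m / (1.0 - beta1 ** t)
        v_hat = v / (1.0 - beta2 ** t)
        # The step is scaled per parameter by the second-moment estimate.
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x


# Hypothetical toy objective: f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
x_min = adam_minimize(lambda x: 2.0 * (x - 3.0), x0=0.0, lr=0.1, steps=2000)
print(x_min)
```

In a real network the same update would be applied independently to every weight, which is what gives each parameter its own effective learning rate.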

Keywords:

Speech recognition; Computer science; Emotion recognition; Artificial intelligence; Natural language processing; Pattern recognition (psychology)

Metrics

FWCI (Field Weighted Citation Impact): 5.69

Citation Normalized Percentile: 0.96

Topics

Speech and Audio Processing

Physical Sciences → Computer Science → Signal Processing

Speech Recognition and Synthesis

Physical Sciences → Computer Science → Artificial Intelligence
