This research applies deep learning to speech emotion recognition (SER), the task of detecting emotion from voice recordings. Accurate vocal emotion recognition has several applications, including human-computer interaction, virtual assistants, and healthcare, where diagnosis can benefit from detailed acoustic analysis and emotion recognition. The study uses emotion-labeled spoken utterances to train deep learning models, chiefly convolutional neural networks (CNNs), which are popular for SER because they can learn the complex voice-signal patterns that indicate different emotional states. This paper presents a comprehensive framework for SER from recorded audio samples, drawing on advances in digital signal processing. The models are trained on speech features extracted from the dataset, namely spectrograms and pitch representations, which capture vocal-tract and pitch characteristics of the speech stream. After training, each model's classification accuracy, i.e., its ability to correctly recognize the emotional content of unseen speech samples, is evaluated, and the best model is compared against state-of-the-art methods. In this work, a VGG16-based CNN outperformed a CNN trained directly on mel-spectrogram features: the mel-spectrogram CNN achieved 89% accuracy on the emotional sound samples, and transfer learning (CNN-VGG16) improved on this result. Other classifiers such as SVM, logistic regression, decision tree, and random forest yielded lower accuracy (60%-75%). Further research should explore composite feature sets for improved classification.
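As a minimal illustration of the log-mel-spectrogram features that SER pipelines like this one typically feed to a CNN, the sketch below computes one from scratch with NumPy. The parameter values (16 kHz sample rate, 512-point FFT, 40 mel bands) are illustrative assumptions, not the paper's actual settings, and a synthetic tone stands in for a real utterance.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sr):
    # Triangular filters spaced evenly on the mel scale.
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):          # rising slope
            if center > left:
                fb[i - 1, k] = (k - left) / (center - left)
        for k in range(center, right):         # falling slope
            if right > center:
                fb[i - 1, k] = (right - k) / (right - center)
    return fb

def log_mel_spectrogram(y, sr=16000, n_fft=512, hop=256, n_mels=40):
    # Frame the signal, apply a Hann window, take the power STFT.
    n_frames = 1 + (len(y) - n_fft) // hop
    window = np.hanning(n_fft)
    frames = np.stack([y[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2
    # Project onto mel bands and log-compress; the result is the
    # 2-D "image" a CNN would consume.
    mel = power @ mel_filterbank(n_mels, n_fft, sr).T
    return np.log(mel + 1e-10)

# Example: one second of a 440 Hz tone as a stand-in for an utterance.
sr = 16000
t = np.arange(sr) / sr
y = np.sin(2 * np.pi * 440.0 * t)
S = log_mel_spectrogram(y, sr)
print(S.shape)  # (time frames, mel bands)
```

In practice a library such as librosa would compute these features, and the resulting 2-D arrays would be resized and stacked into the three-channel input that a pretrained VGG16 expects for transfer learning.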
Prof. Martina Dsouza, Rohan Adhav, Shivam Dubey, Sachin Dwivedi