Automatically recognizing emotion from speech with a computer is a challenging task. Speech emotion recognition (SER) has attracted sustained research interest for over three decades because of its wide range of applications across many industries, such as medical treatment, marketing, customer service, driving, internet search, and education. Researchers have explored many approaches to improving the accuracy of emotion classification. In our work, we used images of the mel frequency cepstral coefficients (MFCC), the mel-spectrogram, and a combination of both as feature inputs to a two-dimensional convolutional neural network (2D-CNN) classifier. We trained the model on each proposed feature individually and on the combined feature images. The experimental results show that the proposed combination of MFCC and mel-spectrogram features outperforms either feature alone for speech emotion recognition. To assess the efficacy of our features, we used three datasets: TESS, RAVDESS, and EMO-DB, on which we obtained emotion classification accuracies of 100%, 81.2%, and 88.89%, respectively.
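The abstract describes feeding MFCC and mel-spectrogram images, individually and fused, into a 2D-CNN. As a minimal NumPy sketch of how such features could be computed and stacked for feature-level fusion: the parameter choices below (sample rate 16 kHz, 512-point FFT, hop of 128, 40 mel bands, 13 coefficients) are illustrative assumptions, not the authors' actual settings, and a real pipeline would typically use a library such as librosa.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    """Triangular filters spaced evenly on the mel scale."""
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):
            fb[i - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):
            fb[i - 1, k] = (right - k) / max(right - center, 1)
    return fb

def mel_spectrogram(y, sr, n_fft=512, hop=128, n_mels=40):
    """Windowed power spectra mapped through the mel filterbank."""
    window = np.hanning(n_fft)
    frames = [np.abs(np.fft.rfft(y[s:s + n_fft] * window)) ** 2
              for s in range(0, len(y) - n_fft + 1, hop)]
    power = np.array(frames).T                    # (n_fft//2+1, n_frames)
    return mel_filterbank(sr, n_fft, n_mels) @ power

def mfcc_from_log_mel(log_mel, n_mfcc=13):
    """DCT-II of the log mel energies (the classic MFCC definition)."""
    n_mels = log_mel.shape[0]
    n = np.arange(n_mels)
    basis = np.cos(np.pi * np.outer(np.arange(n_mfcc), 2 * n + 1) / (2 * n_mels))
    return basis @ log_mel

if __name__ == "__main__":
    sr = 16000
    t = np.linspace(0.0, 1.0, sr, endpoint=False)
    y = np.sin(2 * np.pi * 440.0 * t)             # synthetic stand-in for a speech clip
    log_mel = np.log(mel_spectrogram(y, sr) + 1e-10)
    coeffs = mfcc_from_log_mel(log_mel)
    combined = np.vstack([coeffs, log_mel])       # feature-level fusion of the two images
    print(log_mel.shape, coeffs.shape, combined.shape)
```

The stacked array can then be treated as a single-channel 2D image for a CNN; stacking along a channel axis instead would be an equally plausible fusion scheme.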
Fauzivy Reggiswarashari, Sari Widya Sihwi
Minh H. Pham, Farzan Majeed Noori, Jim Tørresen
Arun Kumar Dubey, Yogita Arora, Neha Gupta, Sarita Yadav, Achin Jain, Devansh Verma