P. Anil Kumar, A. Lakshmi Parvathi, S. Sruthi
This research investigates the effectiveness of different feature extraction techniques for Speech Emotion Recognition (SER) and explores the potential of machine learning to improve accuracy. While Gammatone Cepstral Coefficients (GTCC) are designed to capture auditory features aligned with human perception, they may not adequately capture subtle emotional cues. Mel-Frequency Cepstral Coefficients (MFCC), on the other hand, have proven effective at representing speech signals for emotion recognition. This work compares GTCC-based feature extraction with MFCC and employs a Cubic Support Vector Machine (SVM) classifier to improve the system's ability to learn and distinguish emotional states. Using the CREMA-D and SAVEE datasets, the research aims to advance SER systems with improved accuracy and sensitivity for applications in human-computer interaction. Major Findings: This study focuses on enhancing GTCC features for better emotion recognition in speech. While MFCC achieved higher accuracy, the proposed refinements to GTCC narrowed the performance gap. The results indicate that refined GTCC features, combined with a Cubic SVM, hold promise for effective SER.
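To make the MFCC side of the comparison concrete, the following is a minimal NumPy-only sketch of the standard MFCC pipeline the abstract refers to (framing, windowing, power spectrum, mel filterbank, log, DCT-II). All parameter values (sample rate, FFT size, filter and coefficient counts) are illustrative defaults, not the settings used in the paper; the function names are hypothetical.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sr):
    """Triangular filters spaced evenly on the mel scale."""
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    bin_pts = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        lo, c, hi = bin_pts[i - 1], bin_pts[i], bin_pts[i + 1]
        for k in range(lo, c):           # rising slope of the triangle
            fb[i - 1, k] = (k - lo) / max(c - lo, 1)
        for k in range(c, hi):           # falling slope
            fb[i - 1, k] = (hi - k) / max(hi - c, 1)
    return fb

def mfcc(signal, sr=16000, n_fft=512, hop=256, n_filters=26, n_ceps=13):
    """Frame the signal, take log mel-filterbank energies, decorrelate with a DCT-II."""
    n_frames = 1 + (len(signal) - n_fft) // hop
    window = np.hamming(n_fft)
    fb = mel_filterbank(n_filters, n_fft, sr)
    # DCT-II basis: keeps the first n_ceps cepstral coefficients per frame
    n = np.arange(n_filters)
    dct = np.cos(np.pi * np.outer(np.arange(n_ceps), 2 * n + 1) / (2 * n_filters))
    feats = np.empty((n_frames, n_ceps))
    for t in range(n_frames):
        frame = signal[t * hop : t * hop + n_fft] * window
        power = np.abs(np.fft.rfft(frame)) ** 2 / n_fft
        feats[t] = dct @ np.log(fb @ power + 1e-10)
    return feats
```

Per-utterance statistics of these frame-level features would then feed the classifier; the "Cubic SVM" named in the abstract corresponds to an SVM with a degree-3 polynomial kernel (in scikit-learn terms, `SVC(kernel="poly", degree=3)`). GTCC follows the same pipeline with a gammatone filterbank on the ERB scale in place of the mel filterbank.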