JOURNAL ARTICLE

Improving Speech Emotion Recognition Using Data Augmentation and Balancing Techniques

Abstract

Speech Emotion Recognition (SER) is a challenging task due to the complexity and variability of human emotions. In this paper, we propose an approach to improve SER performance on the EMODB dataset. The approach combines data augmentation techniques, such as noise addition and spectrogram shift, with balancing techniques, including random oversampling. We extract five features from each sample: Mel-frequency cepstral coefficients (MFCC), Chroma, Mel Spectrogram, zero-crossing rate (ZCR), and root mean square (RMS) energy. We compare the performance of four classifiers (MLP, SVM, KNN, and CNN) with and without the proposed approach. In our experiments, the proposed approach achieves higher accuracy and F1-score than the baseline without data augmentation and balancing, with the MLP and CNN models achieving 100% accuracy. These findings highlight the effectiveness of data augmentation and balancing techniques in improving SER performance. The approach also has potential applications in real-life scenarios, including mental health monitoring, human-robot interaction, and speech-based virtual assistants.
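
The augmentation and balancing steps named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the signal-to-noise parameter, and the reading of "spectrogram shift" as a circular shift along the time axis are all assumptions made for illustration, and the inputs are synthetic rather than EMODB recordings.

```python
import numpy as np

def add_noise(signal, snr_db, rng):
    """Add white Gaussian noise at a target signal-to-noise ratio in dB (assumed scheme)."""
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise

def shift_spectrogram(spec, max_shift, rng):
    """Circularly shift a (n_bins, n_frames) spectrogram along the time axis (assumed meaning)."""
    shift = int(rng.integers(-max_shift, max_shift + 1))
    return np.roll(spec, shift, axis=1)

def random_oversample(X, y, rng):
    """Duplicate randomly chosen minority-class rows until every class matches the largest one."""
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    keep = [np.arange(len(y))]  # keep every original sample
    for label, count in zip(classes, counts):
        if count < target:
            members = np.flatnonzero(y == label)
            keep.append(rng.choice(members, size=target - count, replace=True))
    idx = np.concatenate(keep)
    return X[idx], y[idx]

# Synthetic stand-ins for a waveform, a spectrogram, and an imbalanced feature matrix.
rng = np.random.default_rng(0)
signal = np.sin(np.linspace(0, 100, 16000))
noisy = add_noise(signal, snr_db=20, rng=rng)

spec = rng.random((40, 100))
shifted = shift_spectrogram(spec, max_shift=10, rng=rng)

X = rng.random((10, 5))
y = np.array([0] * 7 + [1] * 3)
X_bal, y_bal = random_oversample(X, y, rng)
```

In a sketch like this, oversampling and augmentation would normally be applied to the training split only, so that duplicated or perturbed copies of a recording do not also appear in the test data.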

Keywords:
Computer science; Spectrogram; Speech recognition; Artificial intelligence; Random forest; Mel-frequency cepstrum; Support vector machine; Oversampling; Noise (video); Task (project management); Pattern recognition (psychology); Feature extraction; Machine learning

Metrics

Cited By: 5
FWCI (Field Weighted Citation Impact): 2.08
Refs: 26
Citation Normalized Percentile: 0.82

Topics

Emotion and Mood Recognition
Social Sciences → Psychology → Experimental and Cognitive Psychology
Speech and Audio Processing
Physical Sciences → Computer Science → Signal Processing
EEG and Brain-Computer Interfaces
Life Sciences → Neuroscience → Cognitive Neuroscience