Human-computer interaction applications, such as emergency response systems, depend heavily on emotion recognition. In this study, we investigate Speech Emotion Recognition (SER) in the context of emergency calls using a carefully curated dataset comprising audio recordings from 18 speakers, each expressing four distinct emotional states (angry, drunk, painful, and stressful) while reading predefined emergency scenarios. We performed comprehensive preprocessing, extracting features with Mel-Frequency Cepstral Coefficients (MFCCs), Chroma (pitch classes), and Mel spectrogram frequencies. We then split the dataset into training and testing sets and assessed the performance of several machine learning models, including the KNeighbors Classifier, MLP Classifier, Random Forest Classifier, Gradient Boosting Classifier, SVM, and Logistic Regression. Our results show that the KNeighbors Classifier outperforms the other models, achieving an accuracy of 67.06% while maintaining balanced performance metrics. These findings offer insight into the practicality of SER for emergency call applications. This research advances the understanding of emotion recognition in critical situations and can improve the efficiency of emergency response systems by automating emotion assessment in distress calls, with practical implications for intelligent systems that better assist emergency service providers.
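The pipeline described above (per-recording feature vectors classified by a k-nearest-neighbors model after a train/test split) can be sketched as follows. This is an illustrative example, not the authors' implementation: the synthetic feature vectors, class centers, and the 75/25 split ratio are assumptions, standing in for real MFCC/chroma/mel features that would typically be extracted with a library such as librosa.

```python
import numpy as np

# Hypothetical stand-in for the paper's setup: in practice, each emergency
# call would yield a feature vector of MFCC, chroma, and mel-spectrogram
# statistics; here we generate synthetic clusters, one per emotion class.
rng = np.random.default_rng(0)
EMOTIONS = ["angry", "drunk", "painful", "stressful"]

def make_synthetic_features(n_per_class=40, dim=20):
    """Generate labeled synthetic feature vectors, one Gaussian cluster per emotion."""
    X, y = [], []
    for label in range(len(EMOTIONS)):
        center = rng.normal(scale=3.0, size=dim)  # assumed class separation
        X.append(center + rng.normal(size=(n_per_class, dim)))
        y.extend([label] * n_per_class)
    return np.vstack(X), np.array(y)

def knn_predict(X_train, y_train, X_test, k=5):
    """Classify each test vector by majority vote among its k nearest neighbors."""
    # Pairwise Euclidean distances: shape (n_test, n_train)
    d = np.linalg.norm(X_test[:, None, :] - X_train[None, :, :], axis=2)
    nearest = np.argsort(d, axis=1)[:, :k]          # indices of k closest train samples
    votes = y_train[nearest]                        # their labels
    return np.array([np.bincount(v).argmax() for v in votes])

X, y = make_synthetic_features()
idx = rng.permutation(len(y))
split = int(0.75 * len(y))                          # assumed 75/25 train/test split
train_idx, test_idx = idx[:split], idx[split:]
pred = knn_predict(X[train_idx], y[train_idx], X[test_idx])
accuracy = (pred == y[test_idx]).mean()
```

On well-separated synthetic clusters this toy KNN scores far above the 67.06% reported on real emergency-call audio, which underlines that the difficulty in the paper's task lies in the overlap of acoustic features across emotional states, not in the classifier itself.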
Manoara Begum, Md Akash Rahman, Tanjim Mahmud, Mohammad Shahadat Hossain, Karl Andersson