Neha Dewangan, Sunandan Mandal, Kavita Thakur, Brajesh Kumar Singh
Emotion is an essential part of human communication: people convey emotions through words and body language. Speech emotion recognition is a well-established technique for detecting emotions from speech signals. Here, we propose binary and multiclass classification models that combine two mel-scale spectral feature sets, the Mel-Frequency Cepstral Coefficients (MFCC) and the Mel-Frequency Magnitude Coefficients (MFMC), to extract spectral features from speech signals and classify them using a backpropagation artificial neural network (BPANN). Our study finds that combining the significant features of both feature sets improves training and classification results. The proposed model achieved 85.24% accuracy for multiclass classification of seven emotions using statistically significant features, and 100% accuracy for the binary classifications of Happy versus Sad and Sad versus Fear.
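As a rough illustration of the feature-combination step described above, the sketch below extracts simplified MFCC and MFMC-style features from a waveform with NumPy and concatenates their frame averages into one vector, which could then feed a backpropagation network. The frame size, hop, filter count, and the MFMC simplification (log mel-filterbank magnitudes without the DCT) are illustrative assumptions, not the authors' exact pipeline.

```python
import numpy as np

def mel_filterbank(n_filters=20, n_fft=512, sr=16000):
    """Triangular mel-spaced filterbank, shape (n_filters, n_fft//2 + 1)."""
    hz_to_mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    mel_to_hz = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mels = np.linspace(0.0, hz_to_mel(sr / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mels) / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        for k in range(l, c):
            fb[i - 1, k] = (k - l) / max(c - l, 1)
        for k in range(c, r):
            fb[i - 1, k] = (r - k) / max(r - c, 1)
    return fb

def combined_features(x, sr=16000, frame_len=400, hop=160,
                      n_fft=512, n_filters=20, n_coeff=13):
    """Frame-averaged MFCC + simplified MFMC features, concatenated."""
    # Slice the signal into overlapping frames and apply a Hamming window.
    n_frames = 1 + max(0, (len(x) - frame_len) // hop)
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = x[idx] * np.hamming(frame_len)
    # Magnitude spectrum, then mel-filterbank energies on a log scale.
    mag = np.abs(np.fft.rfft(frames, n=n_fft, axis=1))
    log_mel = np.log(mag @ mel_filterbank(n_filters, n_fft, sr).T + 1e-10)
    # MFCC: DCT-II of the log mel energies (keep n_coeff coefficients).
    m = np.arange(n_filters)
    dct_mat = np.cos(np.pi * np.arange(n_coeff)[:, None] * (m + 0.5) / n_filters)
    mfcc = log_mel @ dct_mat.T
    # MFMC stand-in: log mel magnitudes used directly, without the DCT.
    mfmc = log_mel[:, :n_coeff]
    # Combine both feature sets by concatenating their frame averages.
    return np.hstack([mfcc.mean(axis=0), mfmc.mean(axis=0)])

# Example: a one-second 440 Hz tone yields a 26-dimensional feature vector.
sr = 16000
t = np.arange(sr) / sr
features = combined_features(np.sin(2 * np.pi * 440.0 * t), sr=sr)
```

The resulting vector (here 13 MFCC + 13 MFMC-style coefficients) is the kind of per-utterance input a BPANN classifier could be trained on.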