Traditionally, acoustic emotion recognition systems have used Gaussian Mixture Models (GMMs) for classification. However, GMMs make poor use of multiple frames of input data and cannot efficiently exploit high-dimensional dependencies among features, which limits further improvements in recognition accuracy. Deep neural networks (DNNs) are artificial neural networks with more than one hidden layer; they are first pretrained layer by layer and then fine-tuned with the back-propagation algorithm. A well-trained DNN can model complex, non-linear features of the input training data and better predict the probability distribution over classification labels. In this paper, we replace the GMMs in the recognition system architecture with DNNs and conduct a series of experiments using deep neural networks. Six discrete emotional states are classified with both kinds of classifiers. Our work focuses on the performance of the DNNs, and experiments show that the best recognition rate achieved by the DNN-based system is 8.2 percentage points higher than that of the GMM baseline.
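The classifier described above can be sketched as a small feed-forward network that maps a frame-level acoustic feature vector to a probability distribution over six emotion labels. This is a minimal illustrative sketch, not the paper's implementation: the feature dimension, layer sizes, emotion label set, and random weights are all assumptions made for demonstration (a real system would learn the weights via pretraining and back-propagation as the abstract describes).

```python
import numpy as np

# Hypothetical six-emotion label set (illustrative, not from the paper).
EMOTIONS = ["anger", "disgust", "fear", "happiness", "sadness", "surprise"]

def relu(x):
    # Element-wise rectified linear activation for the hidden layers.
    return np.maximum(0.0, x)

def softmax(x):
    # Numerically stable softmax: output is a probability distribution.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def dnn_forward(features, weights, biases):
    """Propagate one feature vector through the hidden layers,
    then apply softmax over the six emotion classes."""
    h = features
    for W, b in zip(weights[:-1], biases[:-1]):
        h = relu(W @ h + b)
    logits = weights[-1] @ h + biases[-1]
    return softmax(logits)

rng = np.random.default_rng(0)
# Assumed sizes: 39-dim acoustic features (e.g. MFCCs), two hidden
# layers of 64 units, and a 6-way output layer.
layer_sizes = [39, 64, 64, 6]
weights = [rng.standard_normal((n_out, n_in)) * 0.1
           for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [np.zeros(n) for n in layer_sizes[1:]]

probs = dnn_forward(rng.standard_normal(39), weights, biases)
print(EMOTIONS[int(np.argmax(probs))])
```

With random weights the predicted label is meaningless; the point is the shape of the computation: stacked non-linear layers ending in a softmax that yields a valid probability distribution over the emotion classes, which is what a GMM per-class likelihood setup does not provide directly.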