This paper applies a deep learning method, built on a deep belief network (DBN) structure, to the extraction of emotions from Chinese speech. Eight suitable features, such as pitch and mel-frequency cepstral coefficients (MFCC), are extracted from Mandarin speech and used as network inputs, and a DBN classifier replaces traditional shallow learning methods for emotion recognition. Experimental studies show that its recognition rate is higher than that of the traditional back propagation (BP) method and the support vector machine (SVM) classifier.
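As a rough illustration of the kind of network described above, the following minimal sketch stacks two restricted Boltzmann machines (RBMs) and pretrains them greedily with one-step contrastive divergence (CD-1), which is one common way to build a DBN. It is not the authors' implementation: the layer sizes, learning rate, epoch count, toy data, and all names (`RBM`, `pretrain_dbn`, `dbn_features`) are assumptions made for illustration, the eight inputs are assumed to be already scaled to [0, 1], and the supervised output layer and fine-tuning step that a full classifier would need are omitted.

```python
import math
import random


def sigmoid(x):
    # Numerically safe logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


class RBM:
    """Bernoulli restricted Boltzmann machine trained with CD-1 (illustrative)."""

    def __init__(self, n_visible, n_hidden, rng):
        self.rng = rng
        # Small random weights, zero biases.
        self.w = [[rng.gauss(0.0, 0.1) for _ in range(n_hidden)]
                  for _ in range(n_visible)]
        self.b_vis = [0.0] * n_visible
        self.b_hid = [0.0] * n_hidden

    def hidden_probs(self, v):
        # P(h_j = 1 | v) for every hidden unit j.
        return [sigmoid(self.b_hid[j]
                        + sum(v[i] * self.w[i][j] for i in range(len(v))))
                for j in range(len(self.b_hid))]

    def visible_probs(self, h):
        # P(v_i = 1 | h) for every visible unit i.
        return [sigmoid(self.b_vis[i]
                        + sum(h[j] * self.w[i][j] for j in range(len(h))))
                for i in range(len(self.b_vis))]

    def cd1_step(self, v0, lr=0.1):
        # Positive phase, one Gibbs step, then negative phase.
        h0 = self.hidden_probs(v0)
        h0_sample = [1.0 if self.rng.random() < p else 0.0 for p in h0]
        v1 = self.visible_probs(h0_sample)
        h1 = self.hidden_probs(v1)
        for i in range(len(v0)):
            for j in range(len(h0)):
                self.w[i][j] += lr * (v0[i] * h0[j] - v1[i] * h1[j])
            self.b_vis[i] += lr * (v0[i] - v1[i])
        for j in range(len(h0)):
            self.b_hid[j] += lr * (h0[j] - h1[j])


def pretrain_dbn(data, layer_sizes, epochs=20, seed=0):
    # Greedy layer-wise pretraining: each RBM is trained on the
    # hidden activations of the layer below it.
    rng = random.Random(seed)
    layers, inputs = [], data
    for n_hidden in layer_sizes:
        rbm = RBM(len(inputs[0]), n_hidden, rng)
        for _ in range(epochs):
            for v in inputs:
                rbm.cd1_step(v)
        layers.append(rbm)
        inputs = [rbm.hidden_probs(v) for v in inputs]
    return layers


def dbn_features(layers, v):
    # Propagate one input vector up through the stacked RBMs.
    for rbm in layers:
        v = rbm.hidden_probs(v)
    return v


# Hypothetical toy data: 8 features per utterance, values in [0, 1].
toy = [[1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0],
       [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0]] * 4
dbn = pretrain_dbn(toy, layer_sizes=[6, 4])  # assumed layer sizes
codes = dbn_features(dbn, toy[0])
print(len(codes))
```

In a complete recognizer, the top-layer activations returned by `dbn_features` would feed a supervised output layer (for example a softmax over emotion classes), and the whole stack would typically be fine-tuned with labeled data.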
Shiqing Zhang, Xiaoming Zhao, Yuelong Chuang, Wenping Guo, Ying‐Yeh Chen
Dong Liu, Longxi Chen, Zhiyong Wang, Guangqiang Diao