This paper presents a Mandarin speech-based emotion classification method. Five primary human emotions are investigated: anger, boredom, happiness, neutral, and sadness. In emotion classification of speech signals, the conventional features are statistics of fundamental frequency, loudness, duration, and voice quality. However, the recognition accuracy of systems employing these features degrades substantially when more than two valence emotion categories are involved. For speech emotion recognition, we select 16 linear predictive coding (LPC) coefficients, 12 linear prediction cepstral coefficient (LPCC) components, 16 log frequency power coefficient (LFPC) components, 16 perceptual linear prediction (PLP) coefficients, 20 Mel-frequency cepstral coefficient (MFCC) components, and jitter as the basic features forming the feature vector. A Mandarin corpus recorded by 12 non-professional speakers is employed. The recognizer presented in this paper is based on three recognition techniques: linear discriminant analysis (LDA), k-nearest neighbor (K-NN), and hidden Markov models (HMMs). Experimental results show that the selected features are robust and effective for emotion recognition, not only in the arousal dimension but also in the valence dimension.
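As a rough illustration of the feature vector described above, the following Python sketch concatenates the listed feature groups and classifies a query with a plain K-NN vote. It is a minimal sketch under stated assumptions, not the recognizer used in the paper: it assumes each group is already reduced to one fixed-length list per utterance and that jitter is a single scalar, and the names `FEATURE_DIMS`, `build_feature_vector`, and `knn_predict`, the Euclidean distance, and `k=3` are all hypothetical choices made here. The LDA and HMM classifiers are not shown.

```python
import math
from collections import Counter

# Feature group sizes as listed in the abstract.
# Assumption: jitter contributes a single scalar value.
FEATURE_DIMS = {"LPC": 16, "LPCC": 12, "LFPC": 16, "PLP": 16, "MFCC": 20, "jitter": 1}
TOTAL_DIM = sum(FEATURE_DIMS.values())  # 16 + 12 + 16 + 16 + 20 + 1


def build_feature_vector(groups):
    """Concatenate the per-utterance feature groups in a fixed order."""
    vector = []
    for name, dim in FEATURE_DIMS.items():
        values = list(groups[name])
        if len(values) != dim:
            raise ValueError(f"{name}: expected {dim} values, got {len(values)}")
        vector.extend(float(v) for v in values)
    return vector


def knn_predict(train, query, k=3):
    """Majority vote among the k training vectors closest to the query.

    `train` is a list of (feature_vector, emotion_label) pairs.
    Euclidean distance and k=3 are illustrative assumptions.
    """
    ranked = sorted(train, key=lambda item: math.dist(item[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]
```

Under these assumptions the concatenated vector has 81 dimensions per utterance. In practice the groups would come from a speech front end rather than hand-built lists.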
Tsang-Long Pao, Jun-Heng Yeh, Yu-Te Che
Elif Bozkurt, Engin Erzin, Çiğdem Eroğlu Erdem, A. Tanju Erdem
Jun-Heng Yeh, Tsang-Long Pao, Ching-Yi Lin, Yao-Wei Tsai, Yu-Te Chen
Oh-Wook Kwon, Kwokleung Chan, Jiucang Hao, Te-Won Lee