Changjiang Jiang, Rong Mao, Geng Liu, Mingyi Wang
In speech emotion recognition, extracting Mel-frequency cepstral coefficients (MFCC) discards much useful spectral information, resulting in low recognition accuracy. Therefore, a method that combines MFCC and the Mel spectrum for speech emotion recognition is proposed. First, MFCC are extracted from the audio signal, and then the Mel spectrum is extracted from the same signal. Finally, the fused audio features are fed to a support vector machine (SVM) for classification. Experimental results on the EMODB dataset show that fusing multiple features yields higher classification accuracy than a classifier using a single feature, and that the proposed method effectively improves recognition accuracy.
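The pipeline described in the abstract (MFCC extraction, Mel spectrum extraction, fusion) can be illustrated with a minimal NumPy sketch. The abstract does not state the frame size, the number of Mel bands, the number of cepstral coefficients, or how the two feature sets are fused, so everything below is an assumption: 512-sample Hann frames, 40 Mel bands, 13 coefficients, time-averaging of each feature, and fusion by simple concatenation. The function names are hypothetical and are not taken from the paper.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    # Triangular filters with centers evenly spaced on the Mel scale.
    hz_points = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        left, center, right = hz_points[i], hz_points[i + 1], hz_points[i + 2]
        rising = (fft_freqs - left) / (center - left)
        falling = (right - fft_freqs) / (right - center)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def log_mel_spectrogram(signal, sr, n_fft=512, hop=256, n_mels=40):
    # Split the signal into overlapping Hann-windowed frames
    # (assumes len(signal) >= n_fft).
    n_frames = 1 + (len(signal) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hanning(n_fft)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2      # (n_frames, n_fft//2 + 1)
    mel = power @ mel_filterbank(sr, n_fft, n_mels).T     # (n_frames, n_mels)
    return np.log(mel + 1e-10)                            # small offset avoids log(0)

def mfcc_from_log_mel(log_mel, n_mfcc=13):
    # Unnormalized DCT-II over the Mel axis; keep the first n_mfcc coefficients.
    n_mels = log_mel.shape[1]
    n = np.arange(n_mels)
    k = np.arange(n_mfcc)[:, None]
    basis = np.cos(np.pi * k * (2 * n + 1) / (2 * n_mels))  # (n_mfcc, n_mels)
    return log_mel @ basis.T                                # (n_frames, n_mfcc)

def fused_features(signal, sr, n_mels=40, n_mfcc=13):
    # Assumed fusion: concatenate the time-averaged MFCC and log-Mel vectors.
    log_mel = log_mel_spectrogram(signal, sr, n_mels=n_mels)
    mfcc = mfcc_from_log_mel(log_mel, n_mfcc)
    return np.concatenate([mfcc.mean(axis=0), log_mel.mean(axis=0)])

# Synthetic one-second 440 Hz tone standing in for a speech recording.
sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440.0 * t)
features = fused_features(tone, sr)
print(features.shape)  # (53,) = 13 MFCC + 40 Mel bands
```

One fused vector per utterance could then be passed, together with its emotion label, to an SVM implementation such as scikit-learn's `SVC`. That classification step is omitted here because the abstract gives no kernel or hyperparameter settings.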
WANG Zhongmin, LIU Ge, SONG Hui
Shanshan Xiang, Sadiyagul Anwer, Hankiz Yilahun, Askar Hamdulla
Xiaoping Zeng, Dong Li, Guanghui Chen, Qi Dong