JOURNAL ARTICLE

Pre-trained Deep Convolution Neural Network Model With Attention for Speech Emotion Recognition

Hua Zhang, Ruoyun Gou, Jili Shang, Fangyao Shen, Yifan Wu, Guojun Dai

Year: 2021 Journal: Frontiers in Physiology Vol: 12 Pages: 643202-643202 Publisher: Frontiers Media

Abstract

Speech emotion recognition (SER) is a challenging task because emotional expression varies between speakers. The performance of SER depends heavily on the features extracted from speech signals, and building an effective feature extraction and classification model remains difficult. In this paper, we propose a new method for SER based on a Deep Convolution Neural Network (DCNN) and a Bidirectional Long Short-Term Memory with Attention (BLSTMwA) model (DCNN-BLSTMwA). First, we preprocess the speech samples by data enhancement and dataset balancing. Second, we extract three channels of log Mel-spectrograms (static, delta, and delta-delta) as the DCNN input. A DCNN model pre-trained on the ImageNet dataset is then applied to generate segment-level features, and the segment-level features of a sentence are stacked into utterance-level features. Next, we adopt a BLSTM to learn high-level emotional features for temporal summarization, followed by an attention layer that focuses on emotionally relevant features. Finally, the learned high-level emotional features are fed into a Deep Neural Network (DNN) to predict the final emotion. Experiments on the EMO-DB and IEMOCAP databases obtain an unweighted average recall (UAR) of 87.86% and 68.50%, respectively, which is better than most popular SER methods and demonstrates the effectiveness of the proposed method.
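The abstract reports results as unweighted average recall (UAR), which is the mean of the per-class recalls, so each emotion class counts equally however many samples it has. The following is a minimal sketch of that metric only, not the authors' evaluation code; the function name `unweighted_average_recall` and the example labels are hypothetical and chosen for illustration.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Return the mean of per-class recalls (each class weighted equally)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    total = defaultdict(int)    # number of samples per true class
    correct = defaultdict(int)  # correctly predicted samples per true class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1

    # Recall is computed only for classes that occur in y_true.
    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


# Hypothetical, imbalanced example: 8 "neutral" and 2 "angry" utterances.
y_true = ["neutral"] * 8 + ["angry"] * 2
y_pred = ["neutral"] * 8 + ["neutral", "angry"]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
uar = unweighted_average_recall(y_true, y_pred)
print(f"accuracy={accuracy:.2f}, UAR={uar:.2f}")
```

In this made-up example the plain accuracy is 0.90, while the UAR is 0.75 because the minority class has a recall of only 0.5. This is why UAR is a common choice for imbalanced emotion datasets.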

Keywords:
Computer science; Automatic summarization; Speech recognition; Artificial intelligence; Convolutional neural network; Spectrogram; Task (project management); Deep learning; Focus (optics); Sentence; Utterance; Convolution (computer science); Artificial neural network; Pattern recognition (psychology)

Metrics

Cited By: 38
FWCI (Field Weighted Citation Impact): 6.50
Refs: 34
Citation Normalized Percentile: 0.96

Topics

Emotion and Mood Recognition
Social Sciences →  Psychology →  Experimental and Cognitive Psychology
Speech Recognition and Synthesis
Physical Sciences →  Computer Science →  Artificial Intelligence
Speech and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing

Related Documents

JOURNAL ARTICLE

A pre-trained model vs dedicated convolution neural networks for emotion recognition

Asmaa Yaseen Nawaf, Wesam M. Jasim

Journal: International Journal of Power Electronics and Drive Systems/International Journal of Electrical and Computer Engineering Year: 2022 Vol: 13 (1) Pages: 1123-1123

JOURNAL ARTICLE

FExR.A-DCNN: Facial Emotion Recognition with Attention mechanism using Deep Convolution Neural Network

Pratishtha Verma, Vasu Aggrawal, Jyoti Maggu

Journal: Proceedings of the 2022 Fourteenth International Conference on Contemporary Computing Year: 2022 Pages: 196-203

JOURNAL ARTICLE

EEG-Based Emotion Recognition with Deep Convolution Neural Network

Hui-Min Shao, Jianguo Wang, Yu Wang, Yuan Yao, Junjiang Liu

Journal: 2019 IEEE 8th Data Driven Control and Learning Systems Conference (DDCLS) Year: 2019 Pages: 1225-1229