Speech Emotion Recognition with Local-Global Aware Deep Representation Learning

Jiaxing Liu; Zhilei Liu; Longbiao Wang; Lili Guo; Jianwu Dang

doi:10.1109/icassp40776.2020.9053192

JOURNAL ARTICLE

Year: 2020 Pages: 7174-7178

Abstract

Convolutional neural network (CNN) based deep representation learning methods for speech emotion recognition (SER) have demonstrated great success. However, the basic design of a CNN restricts it to modeling only local information well. A capsule network (CapsNet) can overcome this shortcoming of CNNs by capturing shallow global features from the spectrogram, but CapsNet cannot learn local or deep global information. In this paper, we propose a local-global aware deep representation learning system that consists mainly of two modules. One module contains a multi-scale CNN, the time-frequency CNN (TFCNN), to learn the local representation. In the other module, we introduce a structure with dense connections between multiple blocks to learn shallow and deep global information. Every block in this structure is a complete CapsNet improved by a new routing algorithm. The local and global representations are fed to the classifier, achieving an absolute improvement of at least 4.25% over the benchmarks on IEMOCAP.
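
The data flow described in the abstract (densely connected global blocks whose outputs are fused with a local representation) can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the new routing algorithm, the TFCNN layers, or how the dense connections are wired. The sketch assumes DenseNet-style connectivity, uses the standard capsule squash nonlinearity as a stand-in for a full CapsNet block, and all function names (`squash`, `toy_block`, `dense_global_module`, `fuse`) are hypothetical.

```python
import math


def squash(v):
    """Standard capsule squash: keeps the vector's direction and maps its norm into [0, 1)."""
    norm_sq = sum(x * x for x in v)
    if norm_sq == 0.0:
        return [0.0 for _ in v]
    scale = norm_sq / (1.0 + norm_sq) / math.sqrt(norm_sq)
    return [scale * x for x in v]


def toy_block(features):
    """Hypothetical stand-in for one global block (assumption: NOT a real CapsNet)."""
    return squash(features)


def dense_global_module(x, num_blocks=3):
    """Assumed DenseNet-style wiring: each block sees the input plus all earlier block outputs."""
    outputs = [list(x)]
    for _ in range(num_blocks):
        block_in = [v for out in outputs for v in out]  # concatenate everything so far
        outputs.append(toy_block(block_in))
    # Global representation: concatenation of block outputs, shallow to deep.
    return [v for out in outputs[1:] for v in out]


def fuse(local_repr, global_repr):
    """Concatenate local and global representations before the classifier (assumed fusion)."""
    return list(local_repr) + list(global_repr)


if __name__ == "__main__":
    global_repr = dense_global_module([0.5, -1.0, 2.0, 0.0])
    fused = fuse([0.1, 0.2, 0.3], global_repr)
    print(len(global_repr), len(fused))
```

In this toy wiring, later blocks receive both the raw input and the outputs of shallower blocks, which is one way a single structure could expose shallow and deep global information to the classifier at the same time.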

Keywords:

Computer science; Convolutional neural network; Deep learning; Artificial intelligence; Spectrogram; Representation (politics); Feature learning; Block (permutation group theory); Classifier (UML); Pattern recognition (psychology); Speech recognition

Metrics

FWCI (Field Weighted Citation Impact): 7.21

Citation Normalized Percentile: 0.97

Topics

Emotion and Mood Recognition

Social Sciences → Psychology → Experimental and Cognitive Psychology

Speech and Audio Processing

Physical Sciences → Computer Science → Signal Processing

Music and Audio Processing

Physical Sciences → Computer Science → Signal Processing

Related Documents

Deep Representation Learning for Speech Emotion Recognition

Adaptive Domain-Aware Representation Learning for Speech Emotion Recognition

Speech Emotion Recognition with Global-Aware Fusion on Multi-Scale Feature Representation

Speech Emotion Recognition Using Multi-Scale Global–Local Representation Learning with Feature Pyramid Network

Speech Emotion Recognition with deep learning