Adversarial Data Augmentation Network for Speech Emotion Recognition

Yi Lu; Man‐Wai Mak

doi:10.1109/apsipaasc47483.2019.9023347

JOURNAL ARTICLE

Year: 2019; Pages: 529-534

Abstract

Insufficient data is a common issue in training deep learning models. With the introduction of generative adversarial networks (GANs), data augmentation has become a promising solution to this problem. This paper investigates whether data augmentation can help improve speech emotion recognition. Unlike conventional GANs, we train a GAN with an autoencoder, where the inputs to the discriminator come from the bottleneck layer of the autoencoder and the output of the generator. Synthetic samples are obtained from the decoder by using the output of the generator as the decoder's input. The combined network, namely the adversarial data augmentation network (ADAN), can generate samples that share a common latent representation with the real data. Evaluations on EmoDB and IEMOCAP show that, using OpenSmile features as input, the ADAN can produce augmented data that make an ordinary SVM classifier outperform an RNN classifier with local attention and make a DNN competitive with some state-of-the-art systems.
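The data flow described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the layer sizes, the single-layer tanh networks, the random stand-in features, and all names (`dense`, `FEAT_DIM`, `LATENT_DIM`, `NOISE_DIM`) are assumptions chosen for illustration, and no training is performed. It only shows which tensors reach the discriminator and where the synthetic samples come from.

```python
import numpy as np

rng = np.random.default_rng(0)

# All sizes below are illustrative assumptions, not values from the paper.
FEAT_DIM = 16    # size of one acoustic feature vector
LATENT_DIM = 4   # size of the autoencoder bottleneck
NOISE_DIM = 3    # size of the generator's noise input


def dense(in_dim, out_dim):
    """Return an untrained, randomly initialised layer x -> tanh(xW + b)."""
    w = rng.normal(scale=0.1, size=(in_dim, out_dim))
    b = np.zeros(out_dim)
    return lambda x: np.tanh(x @ w + b)


encoder = dense(FEAT_DIM, LATENT_DIM)     # real features -> bottleneck code
decoder = dense(LATENT_DIM, FEAT_DIM)     # latent code -> feature space
generator = dense(NOISE_DIM, LATENT_DIM)  # noise -> generated latent code
_disc_layer = dense(LATENT_DIM, 1)


def discriminator(code):
    """Map latent codes to a score in (0, 1); higher means 'looks real'."""
    return 0.5 * (_disc_layer(code) + 1.0)


real_features = rng.normal(size=(8, FEAT_DIM))  # stand-in for real feature vectors
noise = rng.normal(size=(5, NOISE_DIM))

# The discriminator sees latent codes, not raw features.
real_codes = encoder(real_features)  # from the autoencoder's bottleneck layer
fake_codes = generator(noise)        # from the generator

real_scores = discriminator(real_codes)
fake_scores = discriminator(fake_codes)

# Autoencoder path: reconstruct real data from its own code.
reconstruction = decoder(real_codes)

# Augmentation path: decode generated codes into synthetic feature vectors.
synthetic_features = decoder(fake_codes)

print(synthetic_features.shape)
```

Because the generator is trained against bottleneck codes rather than raw features, the generated codes and the real codes live in the same latent space, which is why decoding them yields samples that share a latent representation with the real data. In a real system the synthetic feature vectors would then be added to the training set of the downstream classifier.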

Keywords:

Computer science; Autoencoder; Discriminator; Classifier (UML); Artificial intelligence; Adversarial system; Recurrent neural network; Generator (circuit theory); Speech recognition; Machine learning; Deep learning; Pattern recognition (psychology); Artificial neural network

Metrics

FWCI (Field Weighted Citation Impact): 3.45

Citation Normalized Percentile: 0.94

Topics

Music and Audio Processing

Physical Sciences → Computer Science → Signal Processing

Emotion and Mood Recognition

Social Sciences → Psychology → Experimental and Cognitive Psychology

Speech Recognition and Synthesis

Physical Sciences → Computer Science → Artificial Intelligence

Related Documents

Improving Speech Emotion Recognition With Adversarial Data Augmentation Network

Facial Emotion Recognition Data Augmentation using Generative Adversarial Network

Speech emotion recognition using data augmentation

Speech Emotion Recognition Using Data Augmentation

Speech emotion recognition using data augmentation method by cycle-generative adversarial networks