JOURNAL ARTICLE

Adversarial Data Augmentation Network for Speech Emotion Recognition

Abstract

Insufficient data is a common issue in training deep learning models. With the introduction of generative adversarial networks (GANs), data augmentation has become a promising solution to this problem. This paper investigates whether data augmentation can improve speech emotion recognition. Unlike conventional GAN setups, we train a GAN together with an autoencoder: the discriminator receives its input from both the bottleneck layer of the autoencoder and the output of the generator. Synthetic samples are obtained by feeding the generator's output into the decoder. The combined network, named the adversarial data augmentation network (ADAN), can generate samples that share a common latent representation with the real data. Evaluations on EmoDB and IEMOCAP show that, with openSMILE features as input, the ADAN can produce augmented data that make an ordinary SVM classifier outperform an RNN classifier with local attention and make a DNN competitive with some state-of-the-art systems.
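
The data flow described in the abstract can be illustrated with a minimal NumPy sketch. It shows the forward pass only: the weights are random, no adversarial or reconstruction training is performed, and the layer sizes, single-layer networks, and all function names are assumptions made for illustration rather than the authors' configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes chosen for illustration; the paper's dimensions are not given here.
FEATURE_DIM, LATENT_DIM, NOISE_DIM = 32, 8, 4


def dense(in_dim, out_dim):
    """Return randomly initialised (weights, bias) for one linear layer."""
    return rng.normal(0.0, 0.1, size=(in_dim, out_dim)), np.zeros(out_dim)


def forward(layer, x, activation=np.tanh):
    """Apply one linear layer followed by an activation."""
    weights, bias = layer
    return activation(x @ weights + bias)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


# Untrained single-layer stand-ins for the four components.
encoder = dense(FEATURE_DIM, LATENT_DIM)    # features -> bottleneck
decoder = dense(LATENT_DIM, FEATURE_DIM)    # bottleneck -> features
generator = dense(NOISE_DIM, LATENT_DIM)    # noise -> fake bottleneck vector
discriminator = dense(LATENT_DIM, 1)        # bottleneck vector -> real/fake score


def discriminator_inputs(real_features, noise):
    """The discriminator sees latent vectors, not raw features."""
    real_latent = forward(encoder, real_features)  # autoencoder bottleneck
    fake_latent = forward(generator, noise)        # generator output
    return real_latent, fake_latent


def synthesize(noise):
    """Decode the generator's output to obtain synthetic feature vectors."""
    fake_latent = forward(generator, noise)
    return forward(decoder, fake_latent, activation=lambda z: z)


real = rng.normal(size=(5, FEATURE_DIM))    # placeholder for real acoustic features
noise = rng.normal(size=(5, NOISE_DIM))

real_latent, fake_latent = discriminator_inputs(real, noise)
scores_real = forward(discriminator, real_latent, activation=sigmoid)
scores_fake = forward(discriminator, fake_latent, activation=sigmoid)
synthetic = synthesize(noise)
```

Because the generator and the encoder both emit vectors in the same latent space, the decoder can map either one back to the feature space, which is how the synthetic samples arise in this sketch.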

Keywords:
Computer science; Autoencoder; Discriminator; Classifier (UML); Artificial intelligence; Adversarial system; Recurrent neural network; Generator (circuit theory); Speech recognition; Machine learning; Deep learning; Pattern recognition (psychology); Artificial neural network

Metrics

Cited By: 30
FWCI (Field Weighted Citation Impact): 3.45
Refs: 40
Citation Normalized Percentile: 0.94

Topics

Music and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing
Emotion and Mood Recognition
Social Sciences →  Psychology →  Experimental and Cognitive Psychology
Speech Recognition and Synthesis
Physical Sciences →  Computer Science →  Artificial Intelligence

Related Documents

JOURNAL ARTICLE

Improving Speech Emotion Recognition With Adversarial Data Augmentation Network

Yi Lu, Man‐Wai Mak

Journal: IEEE Transactions on Neural Networks and Learning Systems; Year: 2020; Vol: 33 (1); Pages: 172-184

JOURNAL ARTICLE

Facial Emotion Recognition Data Augmentation using Generative Adversarial Network

Jin Yong Kim, Geun‐Sik Jo

Journal: Journal of KIISE; Year: 2021; Vol: 48 (4); Pages: 398-404

JOURNAL ARTICLE

Speech emotion recognition using data augmentation

V.M. Praseetha, P P Joby

Journal: International Journal of Speech Technology; Year: 2021; Vol: 25 (4); Pages: 783-792

JOURNAL ARTICLE

Speech emotion recognition using data augmentation method by cycle-generative adversarial networks

Arash Shilandari, Hossein Marvi, Hossein Khosravi, Wenwu Wang

Journal: Signal Image and Video Processing; Year: 2022; Vol: 16 (7); Pages: 1955-1962