JOURNAL ARTICLE

Neural Architecture Search for Speech Emotion Recognition

Xixin Wu, Shoukang Hu, Zhiyong Wu, Xunying Liu, Helen Meng

Year: 2022
Journal: ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Pages: 6902-6906

Abstract

Deep neural networks have brought significant advancements to speech emotion recognition (SER). However, architecture design in SER is mainly based on expert knowledge and empirical (trial-and-error) evaluations, which are time-consuming and resource-intensive. In this paper, we propose to apply neural architecture search (NAS) techniques to automatically configure SER models. To accelerate the candidate architecture optimization, we propose a uniform path dropout strategy to encourage all candidate architecture operations to be equally optimized. Experimental results with two different neural structures on IEMOCAP show that NAS can improve SER performance (54.89% to 56.28%) while maintaining model parameter sizes. The proposed dropout strategy also shows superiority over previous approaches.
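
The abstract does not describe the dropout procedure in detail. As a minimal sketch, assuming that "uniform path dropout" means each training step keeps exactly one candidate operation per supernet edge, drawn with equal probability, and drops the rest, the sampling could look like the following. The operation names, the helper functions, and the single-edge setup are hypothetical and are not taken from the paper.

```python
import random
from collections import Counter

# Hypothetical candidate operations on one supernet edge.
# These toy functions stand in for real layers (convolutions, attention, ...).
CANDIDATE_OPS = {
    "identity": lambda x: list(x),
    "scale_half": lambda x: [0.5 * v for v in x],
    "relu": lambda x: [max(0.0, v) for v in x],
}


def sample_uniform_path(op_names, rng):
    """Keep one candidate operation, chosen uniformly; all others are dropped."""
    return rng.choice(list(op_names))


def mixed_op_forward(x, rng):
    """Forward pass through one edge under the assumed sampling scheme."""
    name = sample_uniform_path(CANDIDATE_OPS, rng)
    return name, CANDIDATE_OPS[name](x)


def count_selections(steps, seed=0):
    """Count how often each candidate is selected over a number of training steps."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(steps):
        name, _ = mixed_op_forward([1.0, -2.0], rng)
        counts[name] += 1
    return counts


if __name__ == "__main__":
    # Each candidate should be picked in roughly one third of the steps.
    print(count_selections(3000))
```

Under this assumption every candidate receives about the same number of updates, which is one plausible reading of "equally optimized"; the paper itself should be consulted for the actual formulation.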

Keywords:
Dropout (neural networks); Computer science; Architecture; Artificial neural network; Artificial intelligence; Deep neural networks; Machine learning; Speech recognition

Metrics

Cited By: 17
FWCI (Field Weighted Citation Impact): 2.00
Refs: 44
Citation Normalized Percentile: 0.87

Topics

Speech Recognition and Synthesis (Physical Sciences → Computer Science → Artificial Intelligence)
Music and Audio Processing (Physical Sciences → Computer Science → Signal Processing)
Emotion and Mood Recognition (Social Sciences → Psychology → Experimental and Cognitive Psychology)