JOURNAL ARTICLE

AutoSpeech: Neural Architecture Search for Speaker Recognition

Abstract

Speaker recognition systems based on Convolutional Neural Networks (CNNs) are often built with off-the-shelf backbones such as VGG-Net or ResNet. However, these backbones were originally proposed for image classification and therefore may not be a natural fit for speaker recognition. Because manually exploring the design space is prohibitively complex, we propose AutoSpeech, the first neural architecture search approach for speaker recognition tasks. Our algorithm first identifies the optimal combination of operations in a neural cell and then derives a CNN model by stacking that cell multiple times. The final speaker recognition model is obtained by training the derived CNN model with the standard scheme. To evaluate the proposed approach, we conduct experiments on both speaker identification and speaker verification tasks using the VoxCeleb1 dataset. Results demonstrate that the CNN architectures derived by the proposed approach significantly outperform current speaker recognition systems based on VGG-M, ResNet-18, and ResNet-34 backbones, while having lower model complexity.
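The two-step procedure described in the abstract (choose the operations inside one cell, then stack that cell) can be illustrated with a minimal sketch. It assumes each edge of the cell carries one score per candidate operation and that the best-scoring operation is kept. The candidate operations, the scores, the cell size, and the names `derive_cell` and `stack_cells` are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple

# Hypothetical candidate operations; the abstract does not list the search space.
CANDIDATE_OPS = ["skip_connect", "conv_3x3", "conv_5x5", "max_pool_3x3"]

Edge = Tuple[int, int]  # (source node, destination node) inside one cell


def derive_cell(scores: Dict[Edge, List[float]]) -> Dict[Edge, str]:
    """Pick the highest-scoring candidate operation on every edge of the cell."""
    cell = {}
    for edge, edge_scores in scores.items():
        if len(edge_scores) != len(CANDIDATE_OPS):
            raise ValueError(f"edge {edge} needs one score per candidate operation")
        best = max(range(len(CANDIDATE_OPS)), key=lambda i: edge_scores[i])
        cell[edge] = CANDIDATE_OPS[best]
    return cell


def stack_cells(cell: Dict[Edge, str], num_cells: int) -> List[Dict[Edge, str]]:
    """Describe a network as the same derived cell repeated num_cells times."""
    if num_cells < 1:
        raise ValueError("num_cells must be at least 1")
    return [dict(cell) for _ in range(num_cells)]


# Made-up scores for a tiny three-node cell, for illustration only.
example_scores = {
    (0, 1): [0.1, 0.6, 0.2, 0.1],
    (0, 2): [0.5, 0.2, 0.2, 0.1],
    (1, 2): [0.1, 0.1, 0.3, 0.5],
}
cell = derive_cell(example_scores)
network = stack_cells(cell, num_cells=8)
```

In this sketch the search result is only a description of the architecture; the derived network would still have to be built and trained from scratch, which corresponds to the "standard scheme" step mentioned in the abstract.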

Keywords:
Computer science; Speech recognition; Speaker recognition; Architecture; Artificial neural network; Artificial intelligence; Pattern recognition (psychology)

Metrics

Cited By: 50
FWCI (Field Weighted Citation Impact): 5.87
Refs: 38
Citation Normalized Percentile: 0.96

Topics

Speech Recognition and Synthesis
Physical Sciences →  Computer Science →  Artificial Intelligence
Music and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing
Speech and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing

Related Documents

JOURNAL ARTICLE

EfficientTDNN: Efficient Architecture Search for Speaker Recognition

Rui Wang, Zhihua Wei, Haoran Duan, Shouling Ji, Yang Long, Zhen Hong

Journal:   IEEE/ACM Transactions on Audio Speech and Language Processing Year: 2022 Vol: 30 Pages: 2267-2279
JOURNAL ARTICLE

Efficient neural architecture search for emotion recognition

Monu Verma, Murari Mandal, Satish Kumar Reddy, Yashwanth Reddy Meedimale, Santosh Kumar Vipparthi

Journal:   Expert Systems with Applications Year: 2023 Vol: 224 Pages: 119957-119957
JOURNAL ARTICLE

Neural Architecture Search for Speech Emotion Recognition

Xixin Wu, Shoukang Hu, Zhiyong Wu, Xunying Liu, Helen Meng

Journal:   ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) Year: 2022 Pages: 6902-6906
JOURNAL ARTICLE

Teacher Guided Neural Architecture Search for Face Recognition

Xiaobo Wang

Journal:   Proceedings of the AAAI Conference on Artificial Intelligence Year: 2021 Vol: 35 (4) Pages: 2817-2825