Speaker recognition systems based on Convolutional Neural Networks (CNNs) are often built with off-the-shelf backbones such as VGG-Net or ResNet. However, these backbones were originally proposed for image classification and therefore may not be a natural fit for speaker recognition. Because manually exploring the design space is prohibitively complex, we propose the first neural architecture search approach for speaker recognition tasks, named AutoSpeech. Our algorithm first identifies the optimal operation combination in a neural cell and then derives a CNN model by stacking the neural cell multiple times. The final speaker recognition model is obtained by training the derived CNN model with the standard scheme. To evaluate the proposed approach, we conduct experiments on both speaker identification and speaker verification tasks using the VoxCeleb1 dataset. Results demonstrate that the CNN architectures derived by the proposed approach significantly outperform current speaker recognition systems based on VGG-M, ResNet-18, and ResNet-34 backbones, while enjoying lower model complexity.
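The two-stage pipeline the abstract describes (search for the best operation combination inside one cell, then derive the final CNN by stacking that cell) can be sketched as follows. This is a hedged toy illustration, not the paper's actual method: the candidate operations, the proxy scoring rule, and all function names (`search_best_cell`, `derive_network`, etc.) are hypothetical; a real search would train and evaluate each candidate cell on speaker data.

```python
# Toy sketch of a cell-based NAS pipeline, assuming a DARTS-style design:
# stage 1 picks an operation combination for one cell, stage 2 stacks it.
from itertools import combinations

# Candidate operations a cell may choose from (illustrative stand-ins).
CANDIDATE_OPS = ["conv3x3", "conv5x5", "max_pool", "skip_connect"]

def score_cell(op_combo):
    """Toy proxy score for a cell; a real search would train and
    evaluate the candidate architecture instead."""
    weights = {"conv3x3": 3, "conv5x5": 2, "max_pool": 1, "skip_connect": 1}
    return sum(weights[op] for op in op_combo)

def search_best_cell(num_ops=2):
    """Stage 1: exhaustively pick the best operation combination."""
    return max(combinations(CANDIDATE_OPS, num_ops), key=score_cell)

def derive_network(cell, depth):
    """Stage 2: derive the final CNN by stacking the searched cell
    `depth` times (here represented as a list of identical cells)."""
    return [list(cell) for _ in range(depth)]

best_cell = search_best_cell()
network = derive_network(best_cell, depth=4)
```

The key design point mirrored here is that search cost is paid once, at the cell level, while depth is obtained cheaply by repetition; the derived network is then trained from scratch with a standard training scheme.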