JOURNAL ARTICLE

Research on Voice Wake Based on Depthwise Separable Convolutional Neural Network

Abstract

This paper studies voice wake-up and builds a 12-layer depthwise separable convolutional neural network (DSCNN). After feature extraction, the model decides whether the wake word was spoken by binary classification of the feature spectrum. With "HelloMia" as the wake word, the training set contains 7982 positive speech samples labeled (1,0) and 1315 negative speech samples labeled (0,1). With a batch normalization (BN) layer introduced, the model converges at 0.3 epochs. Accuracy is 0.9994 on a test set of 10,000 positive samples and 0.9889 on a test set of 2362 negative samples, which corresponds to a wake-up rate of 99.94% and a false wake-up rate of 1.11% (1 - 0.9889). Compared with ordinary convolutional models, DSCNN greatly reduces the number of parameters and memory consumption, with no loss in convergence speed or training effect.
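
The parameter saving claimed for depthwise separable convolutions can be illustrated with a short calculation. The sketch below is not the paper's architecture: the kernel size and channel counts are hypothetical values chosen only for illustration, and bias terms are ignored. It compares the weight count of one standard convolution layer with that of a depthwise convolution followed by a 1x1 pointwise convolution.

```python
def standard_conv_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def separable_conv_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a depthwise separable convolution (bias ignored)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer shape, not taken from the paper.
k, c_in, c_out = 3, 64, 64

std = standard_conv_params(k, c_in, c_out)
sep = separable_conv_params(k, c_in, c_out)
ratio = sep / std  # equals 1/c_out + 1/k**2

print(f"standard:  {std} weights")
print(f"separable: {sep} weights")
print(f"ratio:     {ratio:.4f}")
```

For this assumed layer shape the separable form needs roughly one eighth of the weights. In general the ratio is 1/c_out + 1/k^2, so the saving grows with the kernel size and the number of output channels.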

Keywords:
Computer science; Wake; Convolutional neural network; Separable space; Speech recognition; Artificial intelligence; Engineering; Mathematics

Topics

Speech and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing
Speech Recognition and Synthesis
Physical Sciences →  Computer Science →  Artificial Intelligence
Gait Recognition and Analysis
Physical Sciences →  Engineering →  Biomedical Engineering