JOURNAL ARTICLE

Convolutional Neural Networks for Deep Spoken Keyword Spotting

Abstract

With the growth of biometric security applications, mobile and telephonic communication monitoring, and digital assistants, the practical applications of Keyword Spotting (KWS) have increased manyfold. The use of Artificial Intelligence in Keyword Spotting has greatly improved its accuracy. In this work, following an analysis of various feature extraction and Deep Learning techniques, KWS is performed in both non-streaming and streaming modes. Speech features are extracted using Mel-spectrograms and Mel-frequency Cepstral Coefficients (MFCCs). Of the three broad categories of deep neural networks, a Convolutional Neural Network (CNN) model was implemented for Keyword Spotting because it outperforms Recurrent Neural Network (RNN) and Feedforward Neural Network (FFNN) models, owing to its lower complexity and computational cost. These techniques were applied to the Google Speech Commands Dataset, both online and offline.
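The two feature types named in the abstract are related: MFCCs are commonly obtained by applying a discrete cosine transform to log Mel-spectrogram energies. The sketch below illustrates that relationship for a single audio frame using only the Python standard library. It is not the authors' implementation. The function names, the 16 kHz sample rate, the 256-sample frame, the 20 Mel filters, and the 13 coefficients are all illustrative assumptions, and a practical system would use an FFT-based audio library rather than the naive DFT shown here.

```python
import cmath
import math

def hz_to_mel(hz):
    # Widely used mel-scale formula (an assumption; other variants exist).
    return 2595.0 * math.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def power_spectrum(frame):
    # Naive DFT of a Hamming-windowed frame; returns len(frame)//2 + 1 bins.
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    bins = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        bins.append(abs(acc) ** 2 / n)
    return bins

def mel_filterbank(n_filters, n_fft, sample_rate):
    # Triangular filters whose edges are evenly spaced on the mel scale.
    max_mel = hz_to_mel(sample_rate / 2.0)
    hz_points = [mel_to_hz(max_mel * i / (n_filters + 1))
                 for i in range(n_filters + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_fft // 2 + 1)]
    bank = []
    for m in range(1, n_filters + 1):
        lo, center, hi = hz_points[m - 1], hz_points[m], hz_points[m + 1]
        row = []
        for f in bin_hz:
            if lo < f <= center:
                row.append((f - lo) / (center - lo))
            elif center < f < hi:
                row.append((hi - f) / (hi - center))
            else:
                row.append(0.0)
        bank.append(row)
    return bank

def log_mel_frame(frame, bank):
    # One column of a log Mel-spectrogram: filterbank energies of one frame.
    spec = power_spectrum(frame)
    return [math.log(sum(w * p for w, p in zip(row, spec)) + 1e-10)
            for row in bank]

def mfcc_frame(log_mel, n_coeffs):
    # Unnormalized DCT-II of the log-mel energies, keeping the first n_coeffs.
    n = len(log_mel)
    return [sum(e * math.cos(math.pi * c * (j + 0.5) / n)
                for j, e in enumerate(log_mel))
            for c in range(n_coeffs)]

# Hypothetical usage: one 256-sample frame of a 1 kHz tone at 16 kHz.
SAMPLE_RATE, N_FFT = 16000, 256
frame = [math.sin(2 * math.pi * 1000.0 * i / SAMPLE_RATE) for i in range(N_FFT)]
bank = mel_filterbank(20, N_FFT, SAMPLE_RATE)
log_mel = log_mel_frame(frame, bank)
mfcc = mfcc_frame(log_mel, 13)
```

Stacking `log_mel` vectors over successive frames gives a Mel-spectrogram, and stacking `mfcc` vectors gives an MFCC matrix. Either two-dimensional array could serve as the image-like input to a CNN classifier.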

Keywords:
Keyword spotting; Computer science; Convolutional neural network; Artificial intelligence; Deep learning; Feature extraction; Artificial neural network; Mel-frequency cepstrum; Speech recognition; Feedforward neural network; Recurrent neural network; Spotting; Feature (linguistics); Pattern recognition (psychology)

Metrics

Cited By: 3
FWCI (Field Weighted Citation Impact): 0.77
Refs: 40
Citation Normalized Percentile: 0.71

Topics

Speech Recognition and Synthesis
Physical Sciences →  Computer Science →  Artificial Intelligence
Music and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing
Speech and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing