End-to-End Speech Keyword Spotting System

Shenghua Hu; Hanyue Liu; Liang Xu; Jing Wang; Yujun Wang; Peng Gao; Weiji Zhuang

doi:10.1109/iccsi58851.2023.10304048


JOURNAL ARTICLE


Year: 2023 Vol: 11 Pages: 215-220



Abstract

Speech keyword spotting aims to detect a set of predefined keywords in a continuous speech stream. Building on research into end-to-end deep learning methods, this paper designs and implements an end-to-end speech keyword spotting algorithm with applications in products such as smartphones and automobiles. The algorithm first trains an acoustic model based on a deep neural network, which takes acoustic features as input and outputs the posterior probability of the wake-up word. The posterior probability is then smoothed to obtain a confidence score for the wake-up word, which avoids the traditional decoding process. The paper also compares several neural network structures for the acoustic model, such as the time-delay neural network (TDNN) and the factorized time-delay neural network (TDNN-F). Controlled comparative experiments show that the proposed end-to-end algorithm performs competitively with other popular techniques.
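The smoothing step described in the abstract can be illustrated with a minimal sketch. The abstract does not state which smoothing method, window length, or decision threshold the authors use, so the moving average, the `window` and `threshold` parameters, and all function names below are assumptions chosen for illustration only. The per-frame posteriors are taken as given, standing in for the acoustic model output.

```python
from typing import List


def smooth_posteriors(posteriors: List[float], window: int = 30) -> List[float]:
    """Average each frame with up to `window - 1` preceding frames.

    Assumption: a causal moving average; the paper may use a different smoother.
    """
    if window < 1:
        raise ValueError("window must be >= 1")
    smoothed = []
    running_sum = 0.0
    for i, p in enumerate(posteriors):
        running_sum += p
        if i >= window:
            # Drop the frame that just left the window.
            running_sum -= posteriors[i - window]
        frames_in_window = min(i + 1, window)
        smoothed.append(running_sum / frames_in_window)
    return smoothed


def confidence_score(posteriors: List[float], window: int = 30) -> float:
    """Take the peak of the smoothed posteriors as the confidence (assumption)."""
    smoothed = smooth_posteriors(posteriors, window)
    return max(smoothed) if smoothed else 0.0


def detect(posteriors: List[float], window: int = 30, threshold: float = 0.5) -> bool:
    """Report a wake-up word when the confidence reaches the threshold."""
    return confidence_score(posteriors, window) >= threshold


if __name__ == "__main__":
    sustained = [0.0, 0.0, 1.0, 1.0, 1.0, 0.0]  # several consecutive high frames
    spike = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0]      # one isolated high frame
    print(detect(sustained, window=3), detect(spike, window=3))
```

With a three-frame window, the sustained run reaches the threshold while the isolated spike is averaged down below it, which is the reason for smoothing before thresholding: no decoder is needed, only the frame-level posteriors.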

Keywords:

Keyword spotting; Spotting; Computer science; Artificial neural network; Speech recognition; Decoding methods; Time delay neural network; Word (group theory); End-to-end principle; Speech processing; Process (computing); Acoustic model; Voice activity detection; Artificial intelligence; Algorithm

Metrics

FWCI (Field Weighted Citation Impact): 0.00

Citation Normalized Percentile: 0.15

Topics

Speech Recognition and Synthesis

Physical Sciences → Computer Science → Artificial Intelligence

Music and Audio Processing

Physical Sciences → Computer Science → Signal Processing

Speech and Audio Processing

Physical Sciences → Computer Science → Signal Processing


Related Documents

End-to-End Multi-Look Keyword Spotting

Metadata-Aware End-to-End Keyword Spotting

End-to-end Keyword Spotting using Xception-1d

An End-to-End Far-Field Keyword Spotting System with Neural Beamforming

VE-KWS: Visual Modality Enhanced End-to-End Keyword Spotting