Abstract

The field of conversational agents is growing fast, and there is an increasing need for algorithms that enhance natural interaction. In this work we show how we achieved state-of-the-art results in keyword spotting by adapting and tuning the Xception architecture, which has achieved outstanding results in several computer vision tasks. We obtained about 96% accuracy when classifying audio clips belonging to 35 different categories, beating human annotation on the most complex tasks proposed.

Keywords:
Keyword spotting; Computer science; Spotting; End-to-end principle; Computer vision; Artificial intelligence; Speech recognition

Topics

Music and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing
Speech and dialogue systems
Physical Sciences →  Computer Science →  Artificial Intelligence
Speech Recognition and Synthesis
Physical Sciences →  Computer Science →  Artificial Intelligence