JOURNAL ARTICLE

Investigating lightly supervised acoustic model training

Abstract

The last decade has witnessed substantial progress in speech recognition technology, with today's state-of-the-art systems able to transcribe broadcast audio data with a word error rate of about 20%. However, acoustic model development for these recognizers requires large corpora of manually transcribed training data. Obtaining such data is both time-consuming and expensive, requiring trained human annotators and substantial amounts of supervision. We describe some experiments using different levels of supervision for acoustic model training in order to reduce the system development cost. The experiments were carried out on the DARPA TDT-2 corpus (also used in the SDR99 and SDR00 evaluations). They demonstrate that light supervision is sufficient for acoustic model development, drastically reducing the development cost.
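The abstract does not spell out the training procedure, but a common ingredient of lightly supervised training is to keep only the stretches of audio where the recognizer's hypothesis agrees with an approximate transcript (such as a closed caption), and to use those stretches as training labels. The sketch below illustrates that agreement-filtering step only; the function and variable names are hypothetical and not taken from the paper.

```python
import difflib

def agreed_words(hypothesis, caption):
    """Return the word stretches where the recognizer hypothesis and an
    approximate (e.g. closed-caption) transcript agree. In lightly supervised
    training, only such agreed regions would be kept as training labels."""
    matcher = difflib.SequenceMatcher(a=hypothesis, b=caption)
    kept = []
    for block in matcher.get_matching_blocks():
        if block.size > 0:  # the final block is a zero-size sentinel
            kept.append(hypothesis[block.a:block.a + block.size])
    return kept

# Toy illustration with made-up word sequences:
hyp = "the president met with congressional leaders on tuesday".split()
cap = "president met with leaders tuesday".split()
print(agreed_words(hyp, cap))
# -> [['president', 'met', 'with'], ['leaders'], ['tuesday']]
```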

Keywords:
Computer science, Acoustic model, Speech recognition, Training set, Data modeling, Word error rate, Artificial intelligence, Machine learning, Natural language processing, Speech processing, Database

Metrics

Cited By: 44
FWCI (Field Weighted Citation Impact): 4.83
References: 18
Citation Normalized Percentile: 0.95

Topics

Speech Recognition and Synthesis (Physical Sciences → Computer Science → Artificial Intelligence)
Music and Audio Processing (Physical Sciences → Computer Science → Signal Processing)
Speech and Audio Processing (Physical Sciences → Computer Science → Signal Processing)

Related Documents

JOURNAL ARTICLE

Lightly supervised and unsupervised acoustic model training

Lori Lamel, Jean-Luc Gauvain, Gilles Adda

Journal: Computer Speech & Language, Year: 2002, Vol: 16 (1), Pages: 115-129

DISSERTATION

Speech Recognition Enhanced by Lightly-supervised and Semi-supervised Acoustic Model Training

Sheng Li

University: Kyoto University (Kyoto University Research Information Repository), Year: 2016