We present our experiments on lightly supervised discriminative training with large amounts of broadcast news data for which only closed-caption transcriptions are available (TDT data). In particular, we use language models biased towards the closed-caption transcripts to recognise the audio data, and the recognised transcripts are then used as the training transcriptions for acoustic model training. A range of experiments is presented using maximum likelihood (ML) training as well as discriminative training based on either maximum mutual information (MMI) or minimum phone error (MPE). In a 5xRT broadcast news transcription system that includes adaptation, it is shown that reductions in word error rate (WER) of around 1% absolute can be achieved. Finally, some experiments on training data selection are presented to compare different methods of "filtering" the transcripts.
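A common form of the "filtering" step mentioned above is to keep only those training segments whose recognised transcript agrees closely with the closed captions. The sketch below is an illustrative assumption, not the paper's exact selection criterion: it scores each (caption, hypothesis) pair by word error rate and retains hypotheses below a threshold for use as acoustic-model training transcriptions. The function names, the threshold value, and the WER-based criterion are all hypothetical choices for this example.

```python
def word_error_rate(ref: str, hyp: str) -> float:
    """Word-level Levenshtein distance, normalised by reference length."""
    r, h = ref.split(), hyp.split()
    # d[i][j] = minimum edits to turn r[:i] into h[:j]
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i
    for j in range(len(h) + 1):
        d[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            sub = d[i - 1][j - 1] + (r[i - 1] != h[j - 1])
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[len(r)][len(h)] / max(len(r), 1)


def select_segments(segments, max_wer=0.2):
    """Keep recognised transcripts that agree with the closed caption
    to within max_wer; these become the training transcriptions."""
    return [hyp for caption, hyp in segments
            if word_error_rate(caption, hyp) <= max_wer]
```

For example, a segment whose hypothesis matches its caption exactly (WER 0.0) is retained, while one with several substitutions and insertions is discarded; raising `max_wer` trades transcription accuracy against the amount of training data kept.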