Syllable-based large vocabulary continuous speech recognition

Aravind Ganapathiraju; J. Hamaker; J. Picone; M. Ordowski; George R. Doddington

doi:10.1109/89.917681

JOURNAL ARTICLE

Year: 2001; Journal: IEEE Transactions on Speech and Audio Processing; Vol: 9 (4); Pages: 358-366; Publisher: Institute of Electrical and Electronics Engineers

Abstract

Most large vocabulary continuous speech recognition (LVCSR) systems in the past decade have used a context-dependent (CD) phone as the fundamental acoustic unit. We present one of the first robust LVCSR systems that uses a syllable-level acoustic unit on telephone-bandwidth speech. This effort is motivated by the inherent limitations of phone-based approaches, namely the lack of an easy and efficient way to model long-term temporal dependencies. A syllable unit spans a longer time frame, typically three phones, thereby offering a more parsimonious framework for modeling pronunciation variation in spontaneous speech. We present encouraging results which show that a syllable-based system exceeds the performance of a comparable triphone system in terms of both word error rate (WER) and complexity. The WER of the best syllabic system reported here is 49.1% on a standard Switchboard evaluation, a small improvement over the triphone system. We also report results on a much smaller recognition task, OGI Alphadigits, which was used to validate some of the benefits syllables offer over triphones. The syllable-based system exceeds the performance of the triphone system by nearly 20%, an impressive accomplishment since the alphadigits application consists mostly of phone-level minimal pair distinctions.
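The results above are stated as word error rate (WER). As a rough illustration of how that metric is conventionally defined (word-level substitutions, deletions, and insertions divided by the number of reference words), the sketch below may help. It is not the scoring tool used by the authors, which the abstract does not describe. The function name `word_error_rate`, the whitespace tokenization, and the example transcripts are assumptions made for illustration only.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: transcripts are tokenized by whitespace; real scoring tools
    usually apply text normalization first.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words consumed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1        # reference word missing from hypothesis
            insertion = curr[j - 1] + 1   # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical alphadigit-style strings: one minimal-pair confusion ("b" vs "v")
# in a four-word reference gives one substitution out of four words.
example_wer = word_error_rate("b nine d two", "v nine d two")
```

Because insertions count as errors, a WER computed this way can exceed 100% when the hypothesis is much longer than the reference.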

Keywords:

Computer science; Speech recognition; Syllable; Vocabulary; Word error rate; Phone; Pronunciation; Syllabic verse; Context (archaeology); Task (project management); Artificial intelligence; Natural language processing; Linguistics

Metrics

Cited By: 124

FWCI (Field Weighted Citation Impact): 4.83

Citation Normalized Percentile: 0.95

Topics

Speech Recognition and Synthesis

Physical Sciences → Computer Science → Artificial Intelligence

Phonetics and Phonology Research

Social Sciences → Psychology → Experimental and Cognitive Psychology

Speech and Audio Processing

Physical Sciences → Computer Science → Signal Processing

Related Documents

Syllable Based Language Model for Large Vocabulary Continuous Speech Recognition of Polish

Continuous speech recognition using large vocabulary word spotting and CV syllable spotting

Compact subnetwork-based large vocabulary continuous speech recognition

Syllable-Phoneme based Continuous Speech Recognition

Vietnamese large vocabulary continuous speech recognition