Vietnamese large vocabulary continuous speech recognition

Ngoc Thang Vu; Tanja Schultz

doi:10.1109/asru.2009.5373424

JOURNAL ARTICLE

Year: 2009; Pages: 333-338

Abstract

We report on our recent efforts toward a large vocabulary Vietnamese speech recognition system. In particular, we describe the Vietnamese text and speech database recently collected as part of our GlobalPhone corpus. The data was complemented by a large collection of text data crawled from various Vietnamese websites. To bootstrap the Vietnamese speech recognition system, we used our Rapid Language Adaptation scheme, applying a multilingual phone inventory. After initialization, we investigated the peculiarities of the Vietnamese language and achieved significant improvements by implementing different tone modeling schemes, extended by pitch extraction; handling multiwords to address the monosyllabic structure of Vietnamese; and using language modeling based on 5-grams. Furthermore, we addressed the issue of dialectal variations between South and North Vietnam by creating dialect-dependent pronunciations and including dialect in the context decision tree of the recognizer. Our current best recognition system achieves a word error rate of 11.7% on read newspaper speech.

Keywords:

Vietnamese; Computer science; Speech recognition; Vocabulary; Natural language processing; Artificial intelligence; Linguistics

Metrics

FWCI (Field Weighted Citation Impact): 3.81

Citation Normalized Percentile: 0.95

Topics

Speech Recognition and Synthesis

Physical Sciences → Computer Science → Artificial Intelligence

Phonetics and Phonology Research

Social Sciences → Psychology → Experimental and Cognitive Psychology

Natural Language Processing Techniques

Physical Sciences → Computer Science → Artificial Intelligence

Related Documents

Development of a Vietnamese Large Vocabulary Continuous Speech Recognition System under Noisy Conditions

Research in large vocabulary continuous speech recognition

Large Vocabulary Continuous Audio-Visual Speech Recognition

Advances in Large Vocabulary Continuous Speech Recognition