JOURNAL ARTICLE

Pronunciation lexicon modeling and design for Korean large vocabulary continuous speech recognition

Abstract

In this paper, we describe a pronunciation lexicon model that is especially useful for constructing a morpheme-based pronunciation lexicon to improve the performance of a Korean large vocabulary continuous speech recognition (LVCSR) system. Many pronunciation variations occur at morpheme boundaries in continuous speech. To model cross-morpheme pronunciation variations, we typically use a context-dependent multiple pronunciation lexicon that allows multiple phonetic transcriptions for each word. Since phonemic context, together with morphological category and morpheme boundary information, affects Korean pronunciation variations, we distinguish the phonological rules that apply to phonemes within a morpheme from those that apply across morpheme boundaries. Because pronunciation variations at morpheme boundaries increase the lexicon size, we have designed an optimized pronunciation lexicon that decreases confusability while increasing pronunciation coverage. The results of Korean broadcast news transcription experiments show that, for the same lexical entries, the proposed pronunciation lexicon achieves an 18% reduction in pronunciation lexicon size and an absolute reduction of 0.27% in word error rate (WER).
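
The distinction between within-morpheme and cross-morpheme rules can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Rule` class, the `pronounce` and `lexicon_variants` helpers, the ad hoc romanized phoneme symbols, and the three-rule inventory are all assumptions made for illustration. The two rules used are well-known Korean alternations: obstruent nasalization (먹는 is pronounced [멍는]) and palatalization, which applies only across a morpheme boundary (굳이 is pronounced [구지], while morpheme-internal 마디 is unchanged).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    left: str    # phoneme that changes
    right: str   # triggering phoneme immediately to its right
    output: str  # replacement for `left`
    scope: str   # "cross", "within", or "both" (hypothetical labels)


# Illustrative rule inventory only; a real system would need far more rules
# and would also condition on morphological category.
RULES = [
    Rule("nasalization", "k", "n", "ng", "both"),
    Rule("nasalization", "k", "m", "ng", "both"),
    Rule("palatalization", "t", "i", "j", "cross"),
]


def pronounce(morphemes):
    """Return surface phonemes for a sequence of morphemes (lists of phonemes)."""
    phones, boundary_after = [], set()
    for morpheme in morphemes:
        phones.extend(morpheme)
        # Index of the last phoneme of this morpheme.
        boundary_after.add(len(phones) - 1)
    out = list(phones)
    for i in range(len(phones) - 1):
        is_cross = i in boundary_after
        for rule in RULES:
            scope_ok = rule.scope == "both" or (rule.scope == "cross") == is_cross
            if scope_ok and phones[i] == rule.left and phones[i + 1] == rule.right:
                out[i] = rule.output
                break  # at most one rule per position in this sketch
    return out


def lexicon_variants(morpheme, following):
    """Collect the distinct surface forms of `morpheme` over possible right neighbors."""
    variants = set()
    for nxt in following:
        surface = pronounce([morpheme, nxt])[: len(morpheme)]
        variants.add(tuple(surface))
    return variants


print(pronounce([["m", "eo", "k"], ["n", "eu", "n"]]))  # 먹 + 는
print(pronounce([["k", "u", "t"], ["i"]]))              # 굳 + 이
print(pronounce([["m", "a", "t", "i"]]))                # 마디, one morpheme
print(lexicon_variants(["m", "eo", "k"], [["n", "eu", "n"], ["t", "a"]]))
```

Under these assumptions, `lexicon_variants` shows why cross-morpheme variation enlarges a multiple pronunciation lexicon: each morpheme gains one entry per distinct surface form that its possible right-hand contexts produce.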

Keywords:
Pronunciation, Lexicon, Morpheme, Computer science, Context (archaeology), Speech recognition, Natural language processing, Artificial intelligence, Linguistics, Vocabulary, History

Metrics

Cited By: 3
FWCI (Field Weighted Citation Impact): 0.00
Refs: 7
Citation Normalized Percentile: 0.03

Topics

Speech Recognition and Synthesis
Physical Sciences →  Computer Science →  Artificial Intelligence
Phonetics and Phonology Research
Social Sciences →  Psychology →  Experimental and Cognitive Psychology
Natural Language Processing Techniques
Physical Sciences →  Computer Science →  Artificial Intelligence