In this paper, the recognition performance of several methodologies proposed for Turkish Large Vocabulary Continuous Speech Recognition is evaluated using a limited audio corpus. Word-based, stem-based, stem-ending-based, and morph-based language models are used with different n-gram orders. The word-based and stem-ending-based language models are extended using several approaches. In addition, a hybrid language model built on the word-based and stem-ending-based language models is proposed. The word-based language model is observed to outperform the sub-word language models when a limited audio corpus is used.