Building a stochastic language model (LM) for speech recognition requires a large corpus for the target task. For some tasks no sufficiently large corpus is available, which is an obstacle to achieving high recognition accuracy. In this paper, we propose a method for building an LM with higher predictive power using large corpora from different tasks, rather than an LM estimated from a small corpus for a specific target task. In our experiment, we used transcriptions of Air University lectures and articles from the Nikkei newspaper, and compared an existing interpolation-based method with our new method. The results show that our new method reduces perplexity by 9.71%.
Min Xiao, Feipeng Zhao, Yuhong Guo
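The interpolation-based baseline mentioned in the abstract combines an in-domain LM with an out-of-domain LM via a weighted mixture, with the weight tuned to minimize perplexity on held-out target text. The sketch below illustrates the idea with maximum-likelihood unigram models and toy data; the corpora, function names, and the floor probability for unseen words are all hypothetical stand-ins, not the paper's actual setup.

```python
import math
from collections import Counter

def unigram_probs(tokens):
    """Maximum-likelihood unigram probabilities from a token list."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def interp_prob(word, models, weights, floor=1e-8):
    """Linear interpolation: P(w) = sum_i lambda_i * P_i(w).
    Unseen words fall back to a tiny floor so log() stays finite."""
    return sum(lam * m.get(word, floor) for m, lam in zip(models, weights))

def perplexity(tokens, models, weights):
    """Perplexity = exp(-(1/N) * sum_w log P(w)) over the token stream."""
    logp = sum(math.log(interp_prob(w, models, weights)) for w in tokens)
    return math.exp(-logp / len(tokens))

# Toy stand-ins for the two source corpora (hypothetical data).
lectures = "the model predicts the next word in the lecture".split()
news = "the market report in the newspaper covers the economy".split()
target = "the model in the newspaper".split()  # held-out target text

m_lec = unigram_probs(lectures)
m_news = unigram_probs(news)

# Sweep the interpolation weight and keep the lowest-perplexity setting.
best = min(
    (perplexity(target, [m_lec, m_news], [lam, 1 - lam]), lam)
    for lam in [i / 10 for i in range(11)]
)
print(f"best lambda = {best[1]:.1f}, perplexity = {best[0]:.2f}")
```

Because the held-out text here draws words from both toy corpora, the sweep settles on an intermediate weight rather than either pure model, which is the behavior that motivates interpolation in the first place.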