JOURNAL ARTICLE

Language model adaptation using cross-lingual information

Abstract

The success of statistical language modeling techniques is crucially dependent on the availability of a large amount of training text. For a language in which such large text collections are not available, methods have recently been proposed to take advantage of a resource-rich language, together with cross-lingual information retrieval and machine translation, to sharpen language models for the resource-deficient language. In this paper, we describe investigations into such language models for an automatic speech recognition system for Mandarin Broadcast News. By exploiting a large side-corpus of contemporaneous English news articles to adapt a static Chinese language model to the news story being transcribed, we demonstrate significant improvements in recognition accuracy. The improvement from using English text is greater when less Chinese text is available to estimate the static language model. We also compare our cross-lingual adaptation to monolingual topic-dependent language model adaptation, and achieve further gains by combining the two adaptation techniques.
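The adaptation described above can be illustrated with a minimal sketch: story-specific word probabilities derived from the English side-corpus are linearly interpolated with the static Chinese model. The interpolation weight `lam` and the toy distributions below are illustrative assumptions, not the paper's exact formulation or values.

```python
# Sketch of language model adaptation by linear interpolation:
#   P(w) = lam * P_cross(w) + (1 - lam) * P_static(w)
# P_cross is a story-specific distribution obtained cross-lingually
# (e.g. via retrieval and translation of aligned English news text).

def adapt(p_static, p_cross, lam=0.2):
    """Interpolate a story-specific model with a static language model."""
    vocab = set(p_static) | set(p_cross)
    return {w: lam * p_cross.get(w, 0.0) + (1 - lam) * p_static.get(w, 0.0)
            for w in vocab}

# Toy unigram probabilities for a handful of Chinese words (assumed values).
p_static = {"经济": 0.010, "股票": 0.002, "天气": 0.005}
p_cross  = {"经济": 0.030, "股票": 0.050}   # story about the stock market
p_adapted = adapt(p_static, p_cross)
```

Words prominent in the retrieved English story (here "股票", stock) are boosted relative to the static estimate, while words absent from the story ("天气", weather) are only discounted, never zeroed out.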

Keywords:
Computer science, Language model, Adaptation, Mandarin Chinese, Natural language processing, Machine translation, Cache language model, Artificial intelligence, Speech recognition, Linguistics

Metrics

Cited by: 11
FWCI (Field-Weighted Citation Impact): 1.15
References: 5
Citation Normalized Percentile: 0.81


Topics (Physical Sciences → Computer Science → Artificial Intelligence)

Natural Language Processing Techniques
Speech Recognition and Synthesis
Topic Modeling