Lexical triggers and latent semantic analysis for cross-lingual language model adaptation

Woosung Kim; Sanjeev Khudanpur

doi:10.1145/1034780.1034782

JOURNAL ARTICLE

Year: 2004
Journal: ACM Transactions on Asian Language Information Processing
Vol: 3 (2)
Pages: 94-112
Publisher: Association for Computing Machinery

Abstract

In-domain texts for estimating statistical language models are not easily found for most languages of the world. We present two techniques to take advantage of in-domain text resources in other languages. First, we extend the notion of lexical triggers, which have been used monolingually for language model adaptation, to the cross-lingual problem, permitting the construction of sharper language models for a target-language document by drawing statistics from related documents in a resource-rich language. Next, we show that cross-lingual latent semantic analysis is similarly capable of extracting useful statistics for language modeling. Neither technique requires explicit translation capabilities between the two languages! We demonstrate significant reductions in both perplexity and word error rate on a Mandarin speech recognition task by using these techniques.
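As a rough illustration of how a cross-lingual trigger pair might be scored without any translation system, the sketch below computes the mutual information between word occurrences across document-aligned pairs. The abstract does not state the selection criterion, so mutual information is an assumption here, as are the function name `mutual_information`, the constant `DOC_PAIRS`, and the toy corpus with romanized placeholder tokens standing in for Mandarin words.

```python
import math
from itertools import product


def mutual_information(doc_pairs, src_word, tgt_word):
    """Mutual information (in bits) between two binary events:
    'src_word occurs in the source-language document' and
    'tgt_word occurs in the aligned target-language document'.

    Hypothetical helper for illustration; not taken from the paper.
    """
    n = len(doc_pairs)
    # Joint counts over the four presence/absence combinations.
    counts = {(a, b): 0 for a, b in product((True, False), repeat=2)}
    for src_doc, tgt_doc in doc_pairs:
        counts[(src_word in src_doc, tgt_word in tgt_doc)] += 1

    mi = 0.0
    for (a, b), joint in counts.items():
        if joint == 0:
            continue  # the term 0 * log 0 is taken to be 0
        p_joint = joint / n
        # Marginals obtained by summing the joint counts.
        p_a = sum(v for (x, _), v in counts.items() if x == a) / n
        p_b = sum(v for (_, y), v in counts.items() if y == b) / n
        mi += p_joint * math.log2(p_joint / (p_a * p_b))
    return mi


# Toy document-aligned corpus (assumed data): each entry pairs the word set
# of an English document with the word set of a related target document.
# Target tokens are romanized placeholders, not real corpus entries.
DOC_PAIRS = [
    ({"stock", "market", "bank"}, {"gushi", "yinhang"}),
    ({"stock", "price"}, {"gushi", "jiage"}),
    ({"football", "match"}, {"zuqiu", "bisai"}),
    ({"football", "price"}, {"zuqiu", "jiage"}),
]

if __name__ == "__main__":
    for src in ("stock", "price"):
        print(src, "gushi", mutual_information(DOC_PAIRS, src, "gushi"))
```

On this toy data, "stock" and "gushi" always co-occur and score 1 bit, while "price" occurs independently of "gushi" and scores 0. Mutual information also rewards words that reliably exclude each other, so a practical selection step would probably need to check the direction of the association as well.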

Keywords:

Perplexity; Computer science; Natural language processing; Artificial intelligence; Language model; Latent semantic analysis; Mandarin Chinese; Machine translation; Word (group theory); Cache language model; Adaptation (eye); Linguistics; Natural language; Universal Networking Language; Comprehension approach

Metrics

FWCI (Field Weighted Citation Impact): 3.09

Citation Normalized Percentile: 0.92

Topics

Topic Modeling

Physical Sciences → Computer Science → Artificial Intelligence

Natural Language Processing Techniques

Physical Sciences → Computer Science → Artificial Intelligence

Speech Recognition and Synthesis

Physical Sciences → Computer Science → Artificial Intelligence

Related Documents

Cross-lingual latent semantic analysis for language modeling

Cross-lingual latent semantic analysis

Cross-lingual lexical triggers in statistical language modeling

Cross-Lingual Adaptation for Vision-Language Model via Multimodal Semantic Distillation

Language model adaptation using cross-lingual information