JOURNAL ARTICLE

Morpheme based factored language models for German LVCSR

Abstract

German is a highly inflected language in which a large number of words can be generated from the same root. It also makes liberal use of compounding, which leads to high Out-of-Vocabulary (OOV) rates and poor Language Model (LM) probability estimates. The use of morphemes rather than full words for language modeling is therefore considered a better choice for Large Vocabulary Continuous Speech Recognition (LVCSR), yielding better lexical coverage and lower LM perplexities. On the other hand, Factored Language Models (FLMs) are a successful approach that allows the integration of multiple information sources to obtain better LM probability estimates. In this paper, we investigate a combined methodology in which both morphological decomposition and factored language modeling are used in a single model, a morpheme-based FLM. With this model we obtain around a 2.5% relative reduction in Word Error Rate (WER) with respect to a traditional full-word system.
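To illustrate the two ingredients the abstract combines, the sketch below shows a toy lexicon-driven decomposition of a German compound into morphemes, with each morpheme annotated by a positional factor of the kind one might feed into a factored-LM toolkit. The lexicon, factor names, and greedy splitting strategy are illustrative assumptions, not the paper's actual method.

```python
# Hypothetical sketch: morpheme decomposition + FLM-style factor tagging.
# LEXICON and the B/I/E/S tag set are illustrative assumptions.
LEXICON = {"kranken", "haus", "versicherung", "auto", "bahn"}

def split_compound(word, lexicon=LEXICON):
    """Greedy left-to-right split of a compound into known morphemes.
    Returns the whole word unsplit if no full decomposition is found."""
    word = word.lower()
    parts, rest = [], word
    while rest:
        for end in range(len(rest), 0, -1):  # try the longest match first
            if rest[:end] in lexicon:
                parts.append(rest[:end])
                rest = rest[end:]
                break
        else:
            return [word]  # no decomposition possible; keep the full word
    return parts

def to_factors(word):
    """Attach positional factors (B=begin, I=inside, E=end, S=single)
    to each morpheme, as a simple extra information source for an FLM."""
    parts = split_compound(word)
    if len(parts) == 1:
        return [(parts[0], "S")]
    tags = ["B"] + ["I"] * (len(parts) - 2) + ["E"]
    return list(zip(parts, tags))

print(to_factors("Krankenhaus"))
# → [('kranken', 'B'), ('haus', 'E')]
```

In a real system the decomposition would come from a trained morphological analyzer, and the factored n-gram over (morpheme, tag) pairs would be estimated by an FLM toolkit rather than written by hand.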

Keywords:
Morpheme, Language model, Factored language models, Speech recognition, German, LVCSR, Word error rate, Natural language processing

Metrics

Cited by: 12
FWCI (Field Weighted Citation Impact): 2.74
References: 22
Citation Normalized Percentile: 0.92


Topics

Natural Language Processing Techniques (Physical Sciences → Computer Science → Artificial Intelligence)
Topic Modeling (Physical Sciences → Computer Science → Artificial Intelligence)
Speech Recognition and Synthesis (Physical Sciences → Computer Science → Artificial Intelligence)