BOOK-CHAPTER

Noisy-Channel Spelling Correction Models for Estonian Learner Language Corpus Lemmatisation

Abstract

Morphological analysis is an important task in Estonian learner language studies that gives information about the words and forms used by the learners. Because of the spelling errors frequently occurring in language learner texts, these texts should undergo some error correction step before applying the conventional morphological analysis tools because the morphological analyser fails to find the correct analysis for the misspelled words. In this paper we compare several different spelling correction models with the aim of improving the lemmatisation accuracy of learner language texts. Experiments show that the simplest non-word noisy-channel spelling correction model with a disambiguation model applied on top of the morphological analyser output performs the best while some of the more complicated models even fail to beat the baseline that does not include any spelling correction.

Keywords:
Spelling Estonian Computer science Natural language processing Linguistics Artificial intelligence Channel (broadcasting) Speech recognition Philosophy Telecommunications

Metrics

1
Cited By
0.00
FWCI (Field Weighted Citation Impact)
0
Refs
0.23
Citation Normalized Percentile
Is in top 1%
Is in top 10%

Citation History

Topics

Natural Language Processing Techniques
Physical Sciences →  Computer Science →  Artificial Intelligence
Speech Recognition and Synthesis
Physical Sciences →  Computer Science →  Artificial Intelligence
Speech and dialogue systems
Physical Sciences →  Computer Science →  Artificial Intelligence
© 2026 ScienceGate Book Chapters — All rights reserved.