Saida Laaroussi, Abdellah Yousfi, Si Lhoussain Aouragh, Saïd Ouatik El Alaoui
Automatic spelling correction is an important task in many Natural Language Processing (NLP) applications, such as Optical Character Recognition (OCR) and information retrieval. Many approaches can detect and correct misspelled words, and they fall into two main categories: contextual and context-free. In this paper, we propose a new contextual spelling correction method applied to the Arabic language, without loss of generality for other languages. The method is based on both the Viterbi algorithm and a probabilistic model built with a new estimate of n-gram language models combined with the edit distance. The probabilistic model is learned from an Arabic multipurpose corpus. The originality of our work lies in handling the global and simultaneous correction of several erroneous words within a sentence. The experiments carried out show encouraging results for the correction of several spelling errors in a given context. The method achieves a correction accuracy of up to 93.6% when only the first suggested correction is evaluated. It is able to take into account strong links between distant, meaning-bearing words in a given context. The high correction accuracy of our method allows for its integration into many applications.
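To make the combination described in the abstract concrete, the following is a minimal sketch of a Viterbi search over candidate corrections, scored by a language model and an edit-distance penalty. It is an illustration under stated assumptions, not the authors' implementation: it uses a bigram model with add-one smoothing in place of the paper's new n-gram estimate, a small English toy corpus in place of the Arabic corpus, and the names `train_bigrams`, `correct`, `max_dist`, and `penalty` are hypothetical.

```python
import math
from collections import Counter


def edit_distance(a, b):
    """Levenshtein distance with unit insert, delete, and substitute costs."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def train_bigrams(sentences):
    """Count unigrams and bigrams, with a start-of-sentence marker."""
    unigrams, bigrams = Counter(), Counter()
    for s in sentences:
        tokens = ["<s>"] + s.split()
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def log_bigram(prev, word, unigrams, bigrams, vocab_size):
    # Assumption: add-one smoothing, NOT the estimate proposed in the paper.
    return math.log((bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size))


def correct(sentence, unigrams, bigrams, max_dist=2, penalty=2.0):
    """Viterbi search: choose one candidate per observed word so that the
    sum of bigram log-probabilities and edit-distance penalties is maximal."""
    vocab = [w for w in unigrams if w != "<s>"]
    best = {"<s>": (0.0, [])}  # last word -> (score, best path ending there)
    for observed in sentence.split():
        cands = [(w, edit_distance(observed, w)) for w in vocab]
        # Keep close candidates; fall back to the observed word if none.
        cands = [(w, d) for w, d in cands if d <= max_dist] or [(observed, 0)]
        new_best = {}
        for w, d in cands:
            emission = -penalty * d  # hypothetical weight per edit
            score, path = max(
                ((s + log_bigram(p, w, unigrams, bigrams, len(vocab)) + emission, pth)
                 for p, (s, pth) in best.items()),
                key=lambda t: t[0],
            )
            new_best[w] = (score, path + [w])
        best = new_best
    return " ".join(max(best.values(), key=lambda t: t[0])[1])


corpus = ["the cat sat on the mat", "the dog sat on the rug", "a cat ate the fish"]
unigrams, bigrams = train_bigrams(corpus)
print(correct("the cta sat on teh mat", unigrams, bigrams))
```

Because the search keeps the best path for every candidate at every position, the two misspelled words in the example are resolved jointly rather than one at a time, which is the kind of simultaneous correction the abstract refers to.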
Saida Laaroussi, Si Lhoussain Aouragh, Abdellah Yousfi, Mohammed Nejja, Hicham Geddah, Saïd Ouatik El Alaoui
Abdellah Yousfi, Si Lhoussain Aouragh, Hicham Gueddah, Nejja Mohamed
Moath R. Khaleel, Gheith A. Abandah