Stemming algorithms remove irrelevant morphological variation from words and extract the stem or root from which an input word is derived. Stemming thereby helps to standardize terms that refer to the same concept. These algorithms are widely used in information retrieval systems and Web search engines, as well as in machine translation, text clustering, text summarization, question answering, indexing, text mining, and text classification. The Khoja stemmer is a standard Arabic stemmer, but it has a number of flaws. Previous studies and this one show that the Khoja stemmer outperforms the two competing Arabic stemmers evaluated in this study. The Khoja stemmer and the other two evaluated stemmers rely mainly on (Patterns, Forms, "***"). Identifying the flaws of the Khoja stemmer therefore leads to identifying the patterns that it does not use. The enhancement is accordingly restricted to adding these missing patterns, which improves the accuracy of the Khoja stemmer by around 5%.
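The role of patterns can be illustrated with a minimal Python sketch. It is not the Khoja stemmer itself: the function names, the tiny root list, and the three patterns are assumptions chosen for illustration, and real stemmers also strip affixes and handle weak letters. The letters ف, ع and ل mark the root positions in a pattern, and every other letter must match the word literally.

```python
# Letters that mark root positions in a pattern (illustrative assumption).
# Limitation of this sketch: a pattern cannot contain these letters as literals.
ROOT_SLOTS = "فعل"

def match_pattern(word, pattern):
    """Return the root letters if `word` fits `pattern`, else None."""
    if len(word) != len(pattern):
        return None
    root = []
    for w, p in zip(word, pattern):
        if p in ROOT_SLOTS:
            root.append(w)   # slot position: collect the root letter
        elif w != p:
            return None      # literal position: must match exactly
    return "".join(root)

def extract_root(word, patterns, known_roots):
    """Try each pattern in order; accept a root only if it is a known root."""
    for pattern in patterns:
        root = match_pattern(word, pattern)
        if root is not None and root in known_roots:
            return root
    return None

# Hypothetical data, not taken from any stemmer's actual resources.
known_roots = {"كتب", "درس"}
base_patterns = ["فاعل", "فعال"]
extended_patterns = base_patterns + ["مفعول"]  # one "missing" pattern added

print(extract_root("كاتب", base_patterns, known_roots))       # كتب
print(extract_root("مكتوب", base_patterns, known_roots))      # None
print(extract_root("مكتوب", extended_patterns, known_roots))  # كتب
```

In this toy setting the word مكتوب cannot be stemmed until its pattern is present, which is the kind of gap that adding missing patterns is meant to close.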
Mochamad Gilang Syarief, Opik Taupik Kurahman, Arief Fatchul Huda, Wahyudin Darmalaksana
Tarek Kanan, Odai Sadaqa, Ashraf Almhirat, Emran Kanan
Ibrahim A. Al Kharashi, Imad A. Al Sughaiyer
Mahmoud Eldefrawy, Yasser El-Sonbaty, Nahla A. Belal
Noha S. Fareed, Hamdy M. Mousa, Ashraf B. El-Sisi