Towards improving Khoja rule-based Arabic stemmer

Mohammed N. Al‐Kabi

doi:10.1109/aeect.2013.6716437

JOURNAL ARTICLE

Year: 2013 Pages: 1-6

Abstract

Stemming algorithms remove irrelevant morphological variation from words and extract the stem or root from which an input word is derived. Stemming thereby helps standardize terms that refer to the same concept. These algorithms are widely used in information retrieval systems and Web search engines, as well as in machine translation, text clustering, text summarization, question answering, indexing, text mining, and text classification. The Khoja stemmer is a standard Arabic stemmer with a number of flaws. Previous studies and this one show that the Khoja stemmer performs better than the two other competing stemmers evaluated in this study. The Khoja stemmer and the two other evaluated Arabic stemmers rely mainly on (Patterns, Forms, "***"). Identifying the flaws therefore leads to identifying the patterns that the Khoja stemmer does not use. The enhancement to the Khoja stemmer is accordingly restricted to adding the missing patterns, which improves its accuracy by around 5%.
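To make the role of patterns concrete, the following is a minimal sketch of pattern-based root extraction. It is an illustration under stated assumptions, not the Khoja stemmer's actual code: the function name `extract_root`, the two example patterns, and the convention of marking root positions with the placeholder letters ف, ع, ل are all chosen here for demonstration. It shows how a word that matches no known pattern stays unresolved until the missing pattern is added.

```python
# Hypothetical illustration only; this is not the Khoja stemmer's implementation.
# Placeholder letters that mark the three root positions inside a pattern.
ROOT_SLOTS = "فعل"


def extract_root(word, patterns):
    """Return the root from the first same-length pattern whose fixed
    (non-placeholder) letters all match the word, or None if none match."""
    for pattern in patterns:
        if len(pattern) != len(word):
            continue
        root = []
        for pattern_char, word_char in zip(pattern, word):
            if pattern_char in ROOT_SLOTS:
                # Root position: keep the word's letter.
                root.append(word_char)
            elif pattern_char != word_char:
                # A fixed letter of the pattern does not match; try the next pattern.
                break
        else:
            # Every fixed letter matched.
            return "".join(root)
    return None


# Assumed starting pattern list with a single pattern.
patterns = ["فاعل"]
print(extract_root("كاتب", patterns))   # كتب
print(extract_root("مكتوب", patterns))  # None: no pattern covers this word

# Adding the missing pattern lets the second word resolve to the same root.
patterns.append("مفعول")
print(extract_root("مكتوب", patterns))  # كتب
```

A real stemmer would also strip prefixes and suffixes and validate candidate roots against a dictionary; the sketch leaves those steps out to focus on the pattern-matching step alone.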

Keywords:

Arabic, Computer science, Natural language processing, Artificial intelligence, Linguistics, Philosophy

Metrics

FWCI (Field Weighted Citation Impact): 3.30

Citation Normalized Percentile: 0.93

Topics

Speech and dialogue systems

Physical Sciences → Computer Science → Artificial Intelligence

Natural Language Processing Techniques

Physical Sciences → Computer Science → Artificial Intelligence

Topic Modeling

Physical Sciences → Computer Science → Artificial Intelligence
