JOURNAL ARTICLE

The Effect of Combining Different Feature Selection Methods on Arabic Text Classification

Abstract

Feature selection is one of several factors affecting text classification systems. Feature selection aims to choose a representative subset of all features to reduce the complexity of classification problems. Usually a single method is used for feature selection. For English, several attempts were reported examining the combination of different feature selection methods. To the best of our knowledge no such attempts were reported for Arabic text classification. In this study, we examined the effect of combining five feature selection methods, namely CHI, IG, GSS, NGL and RS, on Arabic text classification accuracy. Two approaches of combination were used, intersection (AND) and union (OR). The NB classification algorithm was used to classify a Saudi Press Agency dataset which comprised 6,300 texts divided evenly into six classes. Three feature representation schemas were used, namely Boolean, TFiDF and LTC. The experiments show slight improvement in classification accuracy for combining two and three feature selection methods. No improvement on classification accuracy was seen when four or all five feature selection methods were combined.

Keywords:
Feature selection Artificial intelligence Computer science Pattern recognition (psychology) Feature (linguistics) Selection (genetic algorithm) Feature extraction Statistical classification Document classification Data mining Machine learning

Metrics

15
Cited By
1.89
FWCI (Field Weighted Citation Impact)
15
Refs
0.90
Citation Normalized Percentile
Is in top 1%
Is in top 10%

Citation History

Topics

Text and Document Classification Technologies
Physical Sciences →  Computer Science →  Artificial Intelligence
Advanced Text Analysis Techniques
Physical Sciences →  Computer Science →  Artificial Intelligence
Spam and Phishing Detection
Physical Sciences →  Computer Science →  Information Systems
© 2026 ScienceGate Book Chapters — All rights reserved.