JOURNAL ARTICLE

Firefly Algorithm based Feature Selection for Arabic Text Classification

Souad Larabi-Marie-Sainte, Nada Alalyani

Year: 2018   Journal: Journal of King Saud University - Computer and Information Sciences   Vol: 32 (3)   Pages: 320-328   Publisher: Elsevier BV

Abstract

Due to the large number of documents available on the internet, in emails and in digital libraries, document classification is becoming a crucial task. It is commonly performed after feature selection, which consists of selecting appropriate features to enhance classification accuracy. Most feature-selection-based text classification methods rely on building a term frequency-inverse document frequency (TF-IDF) feature vector, which is not usually efficient. In addition, numerous document classification studies focus on the English language. This paper deals with Arabic text classification, which has not been intensively studied due to the complexity of the Arabic language. A new firefly-algorithm-based feature selection method is proposed. The firefly algorithm has been successfully applied to various combinatorial problems; however, it has not previously been used for feature selection in Arabic text classification. To validate the technique, a Support Vector Machine classifier is used together with three evaluation measures: precision, recall and F-measure. Furthermore, experiments on the real-world OSAC dataset, along with a comparison with state-of-the-art methods, are performed. The proposed method achieves a precision of 0.994. The results confirm the efficiency of the proposed feature selection method in improving Arabic text classification accuracy.

Keywords: Arabic Natural Language Processing, Feature Selection, Firefly optimization method, Text Classification
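The abstract names the building blocks (a firefly search over feature subsets, a classifier-based score, and precision, recall and F-measure) but gives neither the update rule nor the fitness function. The following is therefore only a minimal sketch of a generic binary firefly search, not the authors' implementation. All names (`firefly_select`, `toy_fitness`, `RELEVANT`) and parameter values (`beta0`, `gamma`, `alpha`, population size) are assumptions made for illustration, and the toy fitness stands in for the classifier score that the paper obtains with a Support Vector Machine.

```python
import math
import random


def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall and F-measure for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def firefly_select(n_features, fitness, n_fireflies=10, n_iter=30,
                   beta0=1.0, gamma=1.0, alpha=0.2, seed=0):
    """Generic binary firefly search (assumed variant, not the paper's).

    Each firefly holds a real-valued position; a feature is selected
    when its coordinate is positive. `fitness` maps a 0/1 mask to a
    score to maximize (brightness).
    """
    rng = random.Random(seed)

    def to_mask(position):
        return [1 if v > 0 else 0 for v in position]

    pos = [[rng.uniform(-1, 1) for _ in range(n_features)]
           for _ in range(n_fireflies)]
    light = [fitness(to_mask(p)) for p in pos]
    start = max(range(n_fireflies), key=light.__getitem__)
    best_mask, best_fit = to_mask(pos[start]), light[start]

    for _ in range(n_iter):
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                if light[j] <= light[i]:
                    continue  # only move toward brighter fireflies
                # Squared distance, averaged over dimensions so that
                # attractiveness does not vanish for long vectors.
                r2 = sum((a - b) ** 2
                         for a, b in zip(pos[i], pos[j])) / n_features
                beta = beta0 * math.exp(-gamma * r2)
                pos[i] = [a + beta * (b - a) + alpha * (rng.random() - 0.5)
                          for a, b in zip(pos[i], pos[j])]
                light[i] = fitness(to_mask(pos[i]))
                if light[i] > best_fit:
                    best_mask, best_fit = to_mask(pos[i]), light[i]
    return best_mask, best_fit


# Hypothetical stand-in for a classifier-based score: reward three
# "relevant" feature indices and lightly penalize every other feature.
RELEVANT = {0, 3, 7}


def toy_fitness(mask):
    hits = sum(1 for i in RELEVANT if mask[i])
    extras = sum(bit for i, bit in enumerate(mask) if i not in RELEVANT)
    return hits - 0.1 * extras


mask, score = firefly_select(12, toy_fitness)
print("selected features:", [i for i, bit in enumerate(mask) if bit])
print("fitness:", score)
```

In a real pipeline, `toy_fitness` would be replaced by a function that trains a classifier on the selected columns of the document-term matrix and returns a validation score such as the F-measure from `precision_recall_f1`.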

Keywords:
Computer science; Feature selection; Artificial intelligence; Classifier (UML); Support vector machine; Firefly algorithm; Pattern recognition (psychology); Feature vector; Selection (genetic algorithm); Feature (linguistics); Data mining; Vector space model; Machine learning; Natural language processing; Particle swarm optimization

Metrics

Cited By: 136
FWCI (Field Weighted Citation Impact): 9.53
Refs: 43
Citation Normalized Percentile: 0.98

Topics

Text and Document Classification Technologies
Physical Sciences →  Computer Science →  Artificial Intelligence
Spam and Phishing Detection
Physical Sciences →  Computer Science →  Information Systems
Advanced Text Analysis Techniques
Physical Sciences →  Computer Science →  Artificial Intelligence
