JOURNAL ARTICLE

Effective feature selection technique for text classification

Hari Seetha, M. Narasimha Murty, R. Saravanan

Year: 2015 Journal: International Journal of Data Mining Modelling and Management Vol: 7 (3) Pages: 165-165 Publisher: Inderscience Publishers

Abstract

Text classification plays a vital role in organising the ever-growing volume of digital documents. The high dimensionality of the feature space is a major obstacle in text classification. Feature selection, an effective preprocessing technique, improves the computational efficiency and the accuracy of a text classifier. In the present paper, text classification is performed with Zipf's law-based feature selection and with linear SVM weights used for feature ranking. A hybrid feature selection method combining these two techniques is proposed. Nearest neighbour and SVM classifiers are chosen as text classifiers for the good classification accuracy reported for them in many text classification tasks. Moreover, to investigate the effect of kernel type on text classification, both linear and non-linear kernels in SVM are examined. Performance is evaluated by determining classification accuracy using ten-fold cross-validation. Experimental results on four benchmark corpora were encouraging and demonstrated that the hybrid feature selection method outperformed both the selection of medium-frequency features based on Zipf's law and feature selection by linear SVM alone.
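The abstract does not give the frequency cutoffs or say how the two techniques are combined, so the following is only a minimal sketch under stated assumptions: "medium frequent" is approximated by document-frequency bounds (`min_df`, `max_df_ratio`, both hypothetical), the linear SVM is a small full-batch subgradient trainer rather than the authors' implementation, and the hybrid step is assumed to be a Zipf-style filter followed by ranking on absolute SVM weight. All function names and the toy corpus are illustrative.

```python
from collections import Counter


def medium_frequency_terms(docs, min_df=2, max_df_ratio=0.8):
    """Keep terms that are neither very rare nor very common.

    The cutoffs are assumptions; the abstract does not state them.
    """
    df = Counter(term for doc in docs for term in set(doc))
    limit = max_df_ratio * len(docs)
    return sorted(t for t, n in df.items() if min_df <= n <= limit)


def train_linear_svm(X, y, lam=0.01, epochs=200, lr=0.1):
    """Full-batch subgradient descent on the L2-regularised hinge loss."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # only margin violators contribute to the hinge term
                for j, xj in enumerate(xi):
                    grad_w[j] -= yi * xj / len(X)
                grad_b -= yi / len(X)
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def hybrid_select(docs, labels, k, **zipf_kwargs):
    """Assumed hybrid: frequency filter first, then rank survivors by |weight|."""
    vocab = medium_frequency_terms(docs, **zipf_kwargs)
    X = [[doc.count(t) for t in vocab] for doc in docs]  # term-count vectors
    w, _ = train_linear_svm(X, labels)
    ranked = sorted(zip(vocab, w), key=lambda p: abs(p[1]), reverse=True)
    return [t for t, _ in ranked[:k]]


# Toy two-class corpus (illustrative only, not one of the paper's benchmarks).
docs = [
    ["the", "goal", "match", "team"],
    ["the", "team", "goal", "score"],
    ["the", "match", "score", "goal"],
    ["the", "stock", "market", "price"],
    ["the", "price", "market", "trade"],
    ["the", "stock", "trade", "price"],
]
labels = [1, 1, 1, -1, -1, -1]

selected = hybrid_select(docs, labels, k=2)
print(selected)
```

In this toy corpus the frequency filter drops "the", which occurs in every document, and the weight ranking then favours the terms that best separate the two classes. Evaluating the selected features with nearest neighbour and SVM classifiers under ten-fold cross-validation, as the abstract describes, would be a separate step not shown here.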

Keywords:
Feature selection; Artificial intelligence; Support vector machine; Linear classifier; Pattern recognition (psychology); Computer science; Preprocessor; Classifier (UML); Zipf's law; Curse of dimensionality; Feature (linguistics); Machine learning; Boosting (machine learning); Feature vector; k-nearest neighbors algorithm; Data mining; Mathematics; Statistics

Metrics

Cited By: 5
FWCI (Field Weighted Citation Impact): 1.57
Refs: 34
Citation Normalized Percentile: 0.91

Topics

Text and Document Classification Technologies
Physical Sciences → Computer Science → Artificial Intelligence
Spam and Phishing Detection
Physical Sciences → Computer Science → Information Systems
Image Retrieval and Classification Techniques
Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Related Documents

BOOK-CHAPTER

A Novel Feature Selection Technique for Text Classification

D. S. Guru, Mostafa Z. Ali, Mahamad Suhil

Advances in intelligent systems and computing Year: 2018 Pages: 721-733

JOURNAL ARTICLE

Feature Selection for Effective Text Classification using Semantic Information

Rajul K. Jain, Nitin Pise

Journal: International Journal of Computer Applications Year: 2015 Vol: 113 (10) Pages: 18-25

JOURNAL ARTICLE

A Feature Selection and Classification Technique for Text Categorization

Moheb R. Girgis, Ashraf A. Aly

Journal: International Journal of Cooperative Information Systems Year: 2003 Vol: 12 (04) Pages: 441-454