JOURNAL ARTICLE

Optimized Approach of Feature Selection Based on Information Gain

Abstract

Text feature selection is a key technology in text classification and text information retrieval. Information gain, a feature selection method, is widely applied in text categorization. This paper analyzes theoretically the deficiencies of information gain as a feature selection method and introduces two improvement factors, LDFWF (Limiting Document Frequency's Word Frequency) and DI (Distribution Information). On this basis, an improved text feature selection method is proposed. In the experiments, an SVM classifier is used for text classification, with features selected by standard information gain and by the proposed improved information gain, respectively. The results show that the proposed method effectively improves the accuracy of text classification.
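
For orientation, the baseline that the abstract refers to can be sketched as follows. This is a minimal illustration of standard information gain for a binary term-presence feature, not the improved method proposed in the paper: the LDFWF and DI factors are not defined in the abstract and are therefore not modeled here. The function names (entropy, information_gain, top_terms) and the toy documents are assumptions made for this example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(docs, labels, term):
    """IG(term) = H(C) - [P(t) H(C | t) + P(not t) H(C | not t)].

    docs is a list of token sets; a term counts as present if it is in the set.
    """
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    conditional = (
        (len(with_term) / n) * entropy(with_term)
        + (len(without_term) / n) * entropy(without_term)
    )
    return entropy(labels) - conditional


def top_terms(docs, labels, k):
    """Return the k terms with the highest information gain (ties broken by name)."""
    vocab = sorted(set().union(*docs))
    scored = [(information_gain(docs, labels, t), t) for t in vocab]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [t for _, t in scored[:k]]


# Hypothetical toy corpus: two classes, four documents.
docs = [
    {"goal", "match"},
    {"goal", "team"},
    {"stock", "market"},
    {"stock", "team"},
]
labels = ["sport", "sport", "finance", "finance"]

selected = top_terms(docs, labels, 2)
```

In this toy corpus, "goal" and "stock" each separate the two classes perfectly and are selected, while "team" appears equally in both classes and receives zero gain. Note that the score depends only on whether a term occurs in a document, not on how often it occurs or how it is distributed within a class; the paper's two factors, as their names suggest, appear aimed at that kind of limitation.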

Keywords:
Information gain; Feature selection; Computer science; Text categorization; Artificial intelligence; Classifier (UML); Information gain ratio; Feature (linguistics); Selection (genetic algorithm); Pattern recognition (psychology); Support vector machine; Mutual information; Data mining

Metrics

Cited By: 29
FWCI (Field Weighted Citation Impact): 5.53
Refs: 13
Citation Normalized Percentile: 0.96

Topics

Web Data Mining and Analysis
Physical Sciences →  Computer Science →  Information Systems
Advanced Computational Techniques and Applications
Physical Sciences →  Computer Science →  Artificial Intelligence
Text and Document Classification Technologies
Physical Sciences →  Computer Science →  Artificial Intelligence