JOURNAL ARTICLE

A HYBRID FEATURE SELECTION METHOD FOR TEXT CATEGORIZATION

Elena Montañés, José Ramón Quevedo, Elías F. Combarro, Irene Díaz, José Ranilla

Year: 2007   Journal: International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems   Vol: 15 (02)   Pages: 133-151   Publisher: World Scientific

Abstract

Feature Selection is an important task within Text Categorization, where irrelevant or noisy features are usually present and cause a loss in classifier performance. Feature Selection in Text Categorization has usually been performed using a filtering approach that selects the features with the highest scores according to certain measures. Measures of this kind come from the Information Retrieval, Information Theory and Machine Learning fields. Wrapper approaches are known to perform better in Feature Selection than filtering approaches, although they are time-consuming and sometimes infeasible, especially in text domains. However, a wrapper that explores a reduced number of feature subsets and that uses a fast method as its evaluation function could overcome these difficulties. The wrapper presented in this paper satisfies these properties. Since exploring a reduced number of subsets could result in less promising subsets, a hybrid approach that combines the wrapper method with some scoring measures makes it possible to explore more promising feature subsets. A comparison among some scoring measures, the wrapper method and the hybrid approach is performed. The results reveal that the hybrid approach outperforms both the wrapper approach and the scoring measures, particularly for corpora whose features are less scattered over the categories.
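
The abstract does not name the scoring measure, the subset search, or the evaluation function, so the following is only a minimal sketch of the general idea under stated assumptions: information gain stands in for the scoring measure, the wrapper explores only a few nested top-k subsets of the ranked features, and a toy term-frequency classifier stands in for the fast evaluation function. The names `information_gain`, `accuracy` and `hybrid_select`, and the tiny corpus, are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def information_gain(docs, labels, term):
    """Filter score (assumed measure): information gain of term presence."""
    def entropy(counts):
        total = sum(counts)
        return -sum(c / total * math.log2(c / total) for c in counts if c)

    n = len(docs)
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    gain = entropy(Counter(labels).values())
    for part in (with_term, without_term):
        if part:
            gain -= len(part) / n * entropy(Counter(part).values())
    return gain


def accuracy(train, valid, features):
    """Fast wrapper evaluation (assumed): accuracy of a toy classifier
    that scores each category by how often the document's selected
    terms occur in that category's training documents."""
    by_cat = {}
    for doc, y in train:
        by_cat.setdefault(y, []).append(doc & features)

    def predict(doc):
        terms = doc & features
        if not terms:
            return None  # no selected term present: counted as an error

        def score(cat):
            cat_docs = by_cat[cat]
            return sum(sum(t in d for d in cat_docs) / len(cat_docs)
                       for t in terms)

        return max(sorted(by_cat), key=score)

    return sum(predict(doc) == y for doc, y in valid) / len(valid)


def hybrid_select(train, valid, sizes):
    """Rank terms with the filter score, then let the wrapper evaluate
    only the top-k subsets for the given sizes (ties favour smaller k)."""
    docs = [d for d, _ in train]
    labels = [y for _, y in train]
    vocab = sorted(set().union(*docs))
    ranked = sorted(vocab,
                    key=lambda t: (-information_gain(docs, labels, t), t))
    best_k = max(sizes,
                 key=lambda k: (accuracy(train, valid, set(ranked[:k])), -k))
    return ranked[:best_k]


# Hypothetical toy corpus: each document is a set of terms.
train = [({"goal", "match", "the"}, "sport"),
         ({"team", "goal", "a"}, "sport"),
         ({"stock", "market", "the"}, "finance"),
         ({"market", "bank", "a"}, "finance")]
valid = [({"goal", "the"}, "sport"),
         ({"market", "a"}, "finance")]

selected = hybrid_select(train, valid, sizes=[1, 2, 4])
```

Because the wrapper only evaluates `len(sizes)` subsets instead of searching the full power set of the vocabulary, its cost stays close to that of the filter, which is the trade-off the abstract describes.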

Keywords:
Feature selection; Computer science; Categorization; Artificial intelligence; Text categorization; Feature (linguistics); Selection (genetic algorithm); Task (project management); Machine learning; Pattern recognition (psychology); Function (biology); Data mining; Engineering

Metrics

Cited By: 5
FWCI (Field Weighted Citation Impact): 0.78
Refs: 10
Citation Normalized Percentile: 0.77

Topics

Text and Document Classification Technologies
Physical Sciences →  Computer Science →  Artificial Intelligence
Advanced Text Analysis Techniques
Physical Sciences →  Computer Science →  Artificial Intelligence
Algorithms and Data Compression
Physical Sciences →  Computer Science →  Artificial Intelligence

Related Documents

JOURNAL ARTICLE

New Feature Selection Method for Text Categorization

Xingfeng Wang, Hee‐Cheol Kim

Journal: Journal of information and communication convergence engineering   Year: 2017   Vol: 15 (1)   Pages: 53-61
BOOK-CHAPTER

An Effective Feature Selection Method for Text Categorization

Xipeng Qiu, Zhou Jinlong, Xuanjing Huang

Lecture notes in computer science   Year: 2011   Pages: 50-61
JOURNAL ARTICLE

IGICA: A Hybrid Feature Selection Approach in Text Categorization

Mohammad Mojaveriyan, Hossein Ebrahimpour-Komleh, Seyed Jalaleddin Mousavirad

Journal: International Journal of Intelligent Systems and Applications   Year: 2016   Vol: 8 (3)   Pages: 42-47
JOURNAL ARTICLE

A two-stage feature selection method for text categorization

Jiana Meng, Hongfei Lin, Yuhai Yu

Journal: Computers & Mathematics with Applications   Year: 2011   Vol: 62 (7)   Pages: 2793-2800