JOURNAL ARTICLE

A Feature Selection and Classification Technique for Text Categorization

Moheb R. Girgis, Ashraf A. Aly

Year: 2003   Journal: International Journal of Cooperative Information Systems   Vol: 12 (04)   Pages: 441-454   Publisher: World Scientific

Abstract

Text categorization is the automated assignment of documents to predefined categories based on their contents. It involves two main tasks: feature selection and document classification. This paper discusses the weak points of the text categorization technique developed by Maron and modified by Lewis. It then introduces a technique for text categorization that uses new formulas for feature selection and document classification, formulated to overcome the weak points of Maron's and Lewis' techniques. The paper also describes the design of an experimental text categorization system composed of the same set of processes as the MAXCAT system developed by Lewis. It presents and analyses the results of applying the system to a set of training and test documents, using both Lewis' formulas and the proposed ones. In addition, a method for evaluating the effectiveness of feature selection separately is given. Finally, the impact of feature set size on the effectiveness of the classification system is investigated by running the system with one of the proposed classification formulas at different feature set sizes.

Keywords:
Feature selection, Categorization, Computer science, Text categorization, Artificial intelligence, Set (abstract data type), Feature (linguistics), Selection (genetic algorithm), Machine learning, Document classification, Pattern recognition (psychology), Data mining

Metrics

Cited By: 1
FWCI (Field Weighted Citation Impact): 0.00
Refs: 6
Citation Normalized Percentile: 0.18

Topics

Text and Document Classification Technologies
Physical Sciences →  Computer Science →  Artificial Intelligence
Advanced Text Analysis Techniques
Physical Sciences →  Computer Science →  Artificial Intelligence
Topic Modeling
Physical Sciences →  Computer Science →  Artificial Intelligence
