Artificial bee colony algorithm for feature selection and improved support vector machine for text classification

Janani Balakumar; S. Vijayarani Mohan

doi:10.1108/idd-09-2018-0045


JOURNAL ARTICLE


Year: 2019
Journal: Information Discovery and Delivery
Vol: 47 (3)
Pages: 154-170
Publisher: Emerald Publishing Limited



Abstract

Purpose: Owing to the huge volume of documents available on the internet, text classification has become a necessary task for handling them. To achieve optimal classification results, feature selection is an important stage that reduces the dimensionality of text documents by choosing suitable features. The main purpose of this research is to classify the documents stored on a personal computer based on their content.

Design/methodology/approach: This paper proposes a new feature selection algorithm based on artificial bee colony (ABCFS) to enhance text classification accuracy. ABCFS is evaluated on real and benchmark data sets and compared with existing feature selection approaches such as information gain and the χ² statistic. To demonstrate the efficiency of the proposed algorithm, a support vector machine (SVM) and an improved SVM classifier are used.

Findings: The experiment was conducted on real and benchmark data sets. The real data set consisted of documents stored on a personal computer, and the benchmark data sets were drawn from the Reuters and 20 Newsgroups corpora. The results show that the proposed feature selection algorithm enhances text document classification accuracy.

Originality/value: This paper proposes a new ABCFS algorithm for feature selection, evaluates its efficiency and improves the support vector machine. Here, ABCFS is used to select features from text (unstructured) documents; in existing work, the artificial bee colony approach has been used to select features from structured data but not from text. The proposed algorithm classifies documents automatically based on their content.
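The abstract names the χ² statistic as one of the baseline feature selection approaches. As a rough illustration only, the sketch below scores terms with the standard 2x2 contingency form of χ² and keeps the top k. It is not the paper's ABCFS algorithm or its experimental setup; the function names (`chi2_score`, `select_top_k`) and the toy corpus are assumptions made up for this example.

```python
def chi2_score(docs, labels, term, target):
    """Chi-square association between a term and a class (2x2 table)."""
    a = b = c = d = 0
    for tokens, label in zip(docs, labels):
        has_term = term in tokens
        in_class = label == target
        if has_term and in_class:
            a += 1  # term present, document in target class
        elif has_term:
            b += 1  # term present, document in another class
        elif in_class:
            c += 1  # term absent, document in target class
        else:
            d += 1  # term absent, document in another class
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return 0.0 if denom == 0 else n * (a * d - c * b) ** 2 / denom


def select_top_k(docs, labels, k):
    """Rank terms by their maximum chi-square over classes; keep k terms."""
    vocab = sorted({t for doc in docs for t in doc})
    classes = sorted(set(labels))
    scores = {
        t: max(chi2_score(docs, labels, t, c) for c in classes) for t in vocab
    }
    # Ties are broken alphabetically so the result is deterministic.
    return sorted(vocab, key=lambda t: (-scores[t], t))[:k]


# Hypothetical toy corpus: each document is a set of tokens.
docs = [
    {"bee", "colony", "honey"},
    {"bee", "hive", "honey"},
    {"kernel", "margin", "vector"},
    {"kernel", "vector", "honey"},
]
labels = ["bees", "bees", "svm", "svm"]

selected = select_top_k(docs, labels, k=3)
print(selected)
```

In this toy setting, terms that occur in exactly one class score highest, while a term such as "honey" that appears in both classes scores lower. The selected terms would then define the reduced feature space passed to a classifier such as an SVM.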

Keywords:

Feature selection; Computer science; Support vector machine; Artificial intelligence; Data mining; Selection (genetic algorithm); Benchmark (surveying); Classifier (UML); Machine learning; Set (abstract data type); Feature (linguistics); Statistical classification; Curse of dimensionality

Metrics

FWCI (Field Weighted Citation Impact): 1.23

Citation Normalized Percentile: 0.84

Topics

Text and Document Classification Technologies

Physical Sciences → Computer Science → Artificial Intelligence

Spam and Phishing Detection

Physical Sciences → Computer Science → Information Systems

Image Retrieval and Classification Techniques

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition


Related Documents

Support Vector Machine Text Classification System: Using Ant Colony Optimization Based Feature Subset Selection

Artificial Flora Algorithm-Based Feature Selection With Support Vector Machine for Cardiovascular Disease Classification

An improved support vector machine classifier based on artificial bee colony algorithm

Multiclass Classification of Brain Cancer with Multiple Multiclass Artificial Bee Colony Feature Selection and Support Vector Machine

Utilizing Artificial Bee Colony Algorithm as Feature Selection Method in Arabic Text Classification