JOURNAL ARTICLE

Clustering-based binary-class classification for imbalanced data sets

Abstract

In this paper, we propose a clustering-based binary-class classification framework that integrates clustering into a binary-class classification approach to handle imbalanced data sets. A binary-class classifier assigns each data instance to one of two classes, while a clustering technique partitions the data instances into groups according to their similarity to one another. After clustering, data instances within the same group are usually more similar to each other than to instances in other groups. In the proposed framework, all negative data instances are first clustered into a set of negative groups. Next, each negative group is combined with all positive data instances to construct a balanced binary-class data set, one per group. Finally, the subspace models trained on these balanced binary-class data sets are integrated with the subspace model trained on the original imbalanced data set to form the proposed classification model. Experimental results demonstrate that the proposed framework performs better than the comparative classification approaches and than the subspace modeling method trained on the original data set alone.
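The three steps described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not name the clustering algorithm, the form of the subspace models, or the integration rule, so the sketch swaps in plain k-means with a deterministic farthest-first initialization, a nearest-centroid scorer as a stand-in for each subspace model, and simple score averaging as the integration step. All function names and the toy data are hypothetical.

```python
from math import dist

def farthest_first(points, k):
    # Deterministic initialization (an assumption): start at the first point,
    # then repeatedly add the point farthest from the centers chosen so far.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    return centers

def kmeans(points, k, iters=20):
    # Plain Lloyd's k-means; returns one cluster index per point.
    centers = farthest_first(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: dist(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old center if a cluster goes empty
                centers[c] = centroid(members)
    return labels

def centroid(points):
    return tuple(sum(axis) / len(points) for axis in zip(*points))

def build_balanced_sets(positives, negatives, k):
    # Steps 1 and 2: cluster the negatives, then pair each negative group
    # with ALL positives to form one smaller binary-class data set per group.
    labels = kmeans(negatives, k)
    groups = [[n for n, l in zip(negatives, labels) if l == c] for c in range(k)]
    return [(positives, group) for group in groups if group]

def train(positives, negatives):
    # Stand-in for a subspace model: the score is positive when x lies
    # closer to the positive centroid than to the negative centroid.
    cp, cn = centroid(positives), centroid(negatives)
    return lambda x: dist(x, cn) - dist(x, cp)

def fit_ensemble(positives, negatives, k):
    # Step 3: combine the per-group models with a model trained on the
    # original imbalanced data. Averaging the scores is an assumption.
    models = [train(p, g) for p, g in build_balanced_sets(positives, negatives, k)]
    models.append(train(positives, negatives))
    return lambda x: sum(m(x) for m in models) / len(models)

# Toy data: 3 positives versus 9 negatives that fall into 3 well-separated groups.
positives = [(5.0, 5.0), (5.5, 4.5), (4.5, 5.5)]
negatives = [(0.0, 0.0), (0.5, 0.2), (0.2, 0.6),
             (10.0, 0.0), (10.3, 0.4), (9.8, 0.1),
             (0.0, 10.0), (0.3, 9.7), (0.1, 10.4)]
score = fit_ensemble(positives, negatives, k=3)
print(score((5.0, 5.0)) > 0, score((0.0, 0.0)) < 0)
```

In this toy setup, choosing k so that each negative group is roughly the size of the positive class is what makes each per-group data set balanced; how the number of groups is chosen in the paper is not stated in the abstract.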

Keywords:
Cluster analysis; Computer science; Pattern recognition (psychology); Artificial intelligence; Subspace topology; Data mining; Binary data; Data set; Class (philosophy); Similarity (geometry); Binary classification; Fuzzy clustering; Binary number; Mathematics; Support vector machine

Metrics

Cited By: 32
FWCI (Field Weighted Citation Impact): 2.74
Refs: 18
Citation Normalized Percentile: 0.92

Topics

Imbalanced Data Classification Techniques
Physical Sciences →  Computer Science →  Artificial Intelligence
Text and Document Classification Technologies
Physical Sciences →  Computer Science →  Artificial Intelligence
Face and Expression Recognition
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition

Related Documents

DISSERTATION

A new genetic algorithm based clustering for binary and imbalanced class data sets

Sabariah Saharan

University:   University of Canterbury Research Repository (University of Canterbury) Year: 2016
JOURNAL ARTICLE

Classification with Local Clustering in Imbalanced Data Sets

Hua Ji, Hua Xiang Zhang

Journal:   Advanced materials research Year: 2011 Vol: 219-220 Pages: 151-155
JOURNAL ARTICLE

Classification of minority class in imbalanced data sets

Journal:   Journal of Mathematical and Computational Science Year: 2021
JOURNAL ARTICLE

Imbalanced Data Classification Based on Clustering

Hu Li, Peng Zou, Wei Han, Rong Xia

Journal:   Applied Mechanics and Materials Year: 2013 Vol: 443 Pages: 741-745