BOOK-CHAPTER

Clustering based random over-sampling examples for learning from binary class imbalanced data sets

Abstract

The data imbalance problem has become a challenge in many real-life classification applications. Although numerous synthetic over-sampling techniques have been proposed to alleviate this problem, most of them do not consider the distribution of the minority examples and may generate noisy synthetic minority examples that overlap the majority examples. To address this, an improved synthetic over-sampling algorithm for balancing binary class data sets, named Clustering Based Random Over-Sampling Examples (CBROSE), is presented in this paper. CBROSE generates synthetic minority examples by combining the k-means clustering algorithm with the basic mechanism of existing synthetic over-sampling methods. The synthetic minority examples created by CBROSE are always located in an elliptical area centered at an observed minority example. Experimental results based on 5-fold cross-validation show the effectiveness of CBROSE on some real-life data sets in terms of AUC.
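
The abstract does not give the algorithm's details, so the following is only a minimal sketch of the general idea it describes: cluster the minority class with k-means, then place each synthetic example inside an ellipse centered at an observed minority example. The function names (`kmeans_labels`, `cluster_based_oversample`), the `shrink` parameter, and the choice of semi-axes (a fraction of the per-feature standard deviation of the seed's cluster) are all assumptions made for illustration, not the authors' method.

```python
import numpy as np


def kmeans_labels(X, k, n_iter=50, rng=None):
    """Plain Lloyd's k-means; returns one cluster label per row of X."""
    rng = np.random.default_rng(rng)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        # Assign each point to its nearest center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each non-empty cluster's center to the mean of its members.
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels


def cluster_based_oversample(X_min, n_new, k=2, shrink=0.5, seed=0):
    """Hypothetical sketch: draw n_new synthetic minority points.

    Each point lies inside an axis-aligned ellipsoid centered at a randomly
    chosen observed minority example (the seed point). ASSUMPTION: the
    semi-axes are shrink * per-feature std of the seed's k-means cluster.
    Returns (synthetic points, seed points, semi-axes).
    """
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    n, d = X_min.shape
    labels = kmeans_labels(X_min, k, rng=rng)

    # Per-cluster semi-axes; a small floor avoids zero-width axes.
    axes = np.zeros((k, d))
    for j in range(k):
        members = X_min[labels == j]
        if len(members) > 0:
            axes[j] = shrink * members.std(axis=0)
    axes = np.maximum(axes, 1e-12)

    # Pick a seed example (with replacement) for every synthetic point.
    idx = rng.integers(0, n, size=n_new)
    seeds = X_min[idx]
    semi = axes[labels[idx]]

    # Uniform draw in the unit ball: random direction, radius ~ U^(1/d).
    direction = rng.normal(size=(n_new, d))
    direction /= np.linalg.norm(direction, axis=1, keepdims=True)
    radius = rng.random(n_new) ** (1.0 / d)
    unit_ball = direction * radius[:, None]

    # Stretch the unit ball into the ellipsoid and shift it onto the seed.
    return seeds + unit_ball * semi, seeds, semi
```

Because the offsets are drawn from a unit ball and then scaled by the semi-axes, every synthetic point satisfies the ellipse inequality around its seed by construction. Tying the semi-axes to the seed's own cluster is one plausible way to keep synthetic points close to the local minority distribution; whether it matches CBROSE is not something the abstract confirms.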

Keywords:
Cluster analysis; Class (philosophy); Computer science; Binary number; Artificial intelligence; Data mining; Machine learning; Mathematics

Metrics

Cited By: 1
FWCI (Field Weighted Citation Impact): 0.29
Refs: 1
Citation Normalized Percentile: 0.53

Topics

Imbalanced Data Classification Techniques
Physical Sciences →  Computer Science →  Artificial Intelligence
Electricity Theft Detection Techniques
Physical Sciences →  Engineering →  Electrical and Electronic Engineering
Text and Document Classification Technologies
Physical Sciences →  Computer Science →  Artificial Intelligence

Related Documents

BOOK-CHAPTER

A New Over-Sampling Approach: Random-SMOTE for Learning from Imbalanced Data Sets

Yanjie Dong, Xuehua Wang

Lecture Notes in Computer Science Year: 2011 Pages: 343-352
DISSERTATION

A new genetic algorithm based clustering for binary and imbalanced class data sets

Sabariah Saharan

University: University of Canterbury Research Repository (University of Canterbury) Year: 2016
JOURNAL ARTICLE

A Novel Over-Sampling Method Based on EDAs for Learning from Imbalanced Data Sets

Wei Liu, Zhang Dong-mei, Yang Li

Journal: Journal of Convergence Information Technology Year: 2011 Vol: 6 (11) Pages: 237-247