JOURNAL ARTICLE

COMPARATIVE ANALYSIS OF CLUSTER CONCENTRIC CIRCLE BASED UNDER SAMPLING OVER LOW VERSUS HIGH DIMENSIONAL IMBALANCED DATASETS

S. Srividhya

Year: 2017   Journal: International Journal of Advanced Research in Computer Science   Vol: 8 (8)   Pages: 433-437   Publisher: International Journal of Advanced Research in Computer Science

Abstract

An imbalanced dataset affects the supervised learning model trained on it. Most real-world datasets are imbalanced, and many are also high dimensional. Existing classification methods tend to perform very well on the majority class while giving little weight to the minority class. Most solutions proposed for imbalanced datasets do not suit high-dimensional imbalanced datasets. This paper compares the performance of an existing balancing method, cluster concentric circle based under-sampling (C3BUS), on low-dimensional versus high-dimensional imbalanced datasets. The work shows that C3BUS performs quite well on low-dimensional imbalanced datasets compared with high-dimensional ones, and that class imbalance and high dimensionality are two of the main issues in the supervised learning process.
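The abstract does not describe the steps of C3BUS, so the following is not the author's method. It is a minimal sketch of generic distance-ranked under-sampling, given only to illustrate what balancing by under-sampling means: majority-class points are ranked by distance to the minority-class centroid and only as many as there are minority points are kept. The function names, the choice of the minority centroid as reference point, and the synthetic two-dimensional data are all assumptions made for illustration.

```python
import math
import random


def centroid(points):
    """Return the coordinate-wise mean of a list of equal-length points."""
    dims = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dims)]


def undersample_majority(majority, minority):
    """Keep the len(minority) majority points nearest the minority centroid.

    Hypothetical helper: a simplified stand-in for a distance-based
    under-sampler, not the C3BUS algorithm itself.
    """
    center = centroid(minority)
    ranked = sorted(majority, key=lambda p: math.dist(p, center))
    return ranked[:len(minority)]


# Synthetic imbalanced data (assumed): 10 minority vs 100 majority points.
random.seed(0)
minority = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(10)]
majority = [[random.gauss(2, 1), random.gauss(2, 1)] for _ in range(100)]

kept = undersample_majority(majority, minority)
print(len(kept), len(minority))  # both classes now have the same size
```

Because the ranking relies on Euclidean distance, a sketch like this would be expected to be sensitive to the number of dimensions, which is the setting the paper compares.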

Keywords:
Computer science; High dimensional; Artificial intelligence; Machine learning; Oversampling; Sampling (signal processing); Cluster (spacecraft); Curse of dimensionality; Class (philosophy); Data mining; Cluster analysis; Pattern recognition (psychology); Process (computing)

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 14
Citation Normalized Percentile: 0.14

Topics

Imbalanced Data Classification Techniques
Physical Sciences →  Computer Science →  Artificial Intelligence
Machine Learning and Data Classification
Physical Sciences →  Computer Science →  Artificial Intelligence
Text and Document Classification Technologies
Physical Sciences →  Computer Science →  Artificial Intelligence