JOURNAL ARTICLE

Clustering-based improved adaptive synthetic minority oversampling technique for imbalanced data classification

Dian Jin, Dehong Xie, Di Liu, Murong Gong

Year: 2023   Journal: Intelligent Data Analysis   Vol: 27 (3)   Pages: 635-652   Publisher: IOS Press

Abstract

The Synthetic Minority Oversampling Technique (SMOTE) and several of its extensions are widely used to balance imbalanced data. In this study, we address overfitting of the classification model that arises when the instances chosen for oversampling increase overlap with the majority class. Our method, Clustering-based Improved Adaptive Synthetic Minority Oversampling Technique (CI-ASMOTE1), decomposes the minority instances into sub-clusters according to their connectivity in the feature space and then selects the minority sub-clusters that are relatively close to the decision boundary as candidate regions for oversampling. With CI-ASMOTE1, new minority instances are synthesized only within each connected region of the selected sub-clusters. To increase the diversity of the synthetic instances in each selected sub-cluster, CI-ASMOTE2 extends CI-ASMOTE1 by keeping the features of those instances as different from one another as possible in the feature space. The experimental evaluation shows that CI-ASMOTE1 and CI-ASMOTE2 improve on SMOTE and its extensions, especially when the minority and majority instances overlap.
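
The pipeline described in the abstract (split the minority class into connected sub-clusters, keep those near the decision boundary, interpolate only inside them) can be illustrated with a minimal sketch. The abstract does not give the connectivity criterion, the boundary measure, or the sampling rule, so everything here is an assumption made for illustration: points are linked when they lie within a hypothetical radius `eps`, closeness to the boundary is approximated by the smallest distance to any majority point compared against a hypothetical `boundary_radius`, and new points are drawn by SMOTE-style linear interpolation between linked neighbours. The diversity step of CI-ASMOTE2 is not covered.

```python
import math
import random


def connected_subclusters(points, eps):
    """Group point indices into connected components (ASSUMED rule:
    two points are linked when their Euclidean distance is <= eps)."""
    unvisited = set(range(len(points)))
    clusters = []
    while unvisited:
        stack = [unvisited.pop()]
        component = []
        while stack:
            i = stack.pop()
            component.append(i)
            near = [j for j in unvisited
                    if math.dist(points[i], points[j]) <= eps]
            for j in near:
                unvisited.remove(j)
                stack.append(j)
        clusters.append(sorted(component))
    clusters.sort()  # deterministic order for reproducibility
    return clusters


def boundary_distance(cluster, minority, majority):
    """ASSUMED proxy for closeness to the decision boundary:
    smallest distance from any cluster member to any majority point."""
    return min(math.dist(minority[i], m) for i in cluster for m in majority)


def oversample(minority, majority, eps, boundary_radius, n_new, seed=0):
    """Synthesize n_new minority points inside the selected sub-clusters."""
    rng = random.Random(seed)
    clusters = connected_subclusters(minority, eps)
    # Keep only sub-clusters near the majority class; singletons are
    # skipped because they have no linked neighbour to interpolate with.
    selected = [c for c in clusters
                if len(c) >= 2
                and boundary_distance(c, minority, majority) <= boundary_radius]
    synthetic = []
    if not selected:
        return synthetic
    for k in range(n_new):
        cluster = selected[k % len(selected)]  # round-robin over clusters
        i = rng.choice(cluster)
        # Interpolate only along links of the connectivity graph, so the
        # new point stays inside the connected region.
        neighbours = [j for j in cluster if j != i
                      and math.dist(minority[i], minority[j]) <= eps]
        j = rng.choice(neighbours)
        t = rng.random()
        synthetic.append(tuple(a + t * (b - a)
                               for a, b in zip(minority[i], minority[j])))
    return synthetic


minority = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0), (10.0, 10.0), (10.5, 10.0)]
majority = [(1.5, 0.0), (2.0, 0.0)]
new_points = oversample(minority, majority, eps=0.6,
                        boundary_radius=1.0, n_new=5)
```

In this toy data the far-away pair around (10, 10) forms its own sub-cluster and is not oversampled, while all synthetic points fall on the segment between (0, 0) and (1, 0), next to the majority class.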

Keywords:
Oversampling; Overfitting; Cluster analysis; Pattern recognition (psychology); Computer science; Artificial intelligence; Feature (linguistics); Synthetic data; Decision boundary; Cluster (spacecraft); Data mining; Machine learning; Support vector machine; Artificial neural network; Bandwidth (computing)

Metrics

Cited By: 2
FWCI (Field Weighted Citation Impact): 0.51
Refs: 36
Citation Normalized Percentile: 0.64

Topics

Imbalanced Data Classification Techniques
Physical Sciences →  Computer Science →  Artificial Intelligence
Electricity Theft Detection Techniques
Physical Sciences →  Engineering →  Electrical and Electronic Engineering
Text and Document Classification Technologies
Physical Sciences →  Computer Science →  Artificial Intelligence
