JOURNAL ARTICLE

Over-sampling algorithm for imbalanced data classification

Xiaolong Xu, Wen Chen, Yanfei Sun

Year: 2019   Journal: Journal of Systems Engineering and Electronics   Vol: 30 (6)   Pages: 1182-1191   Publisher: Institute of Electrical and Electronics Engineers

Abstract

For imbalanced datasets, the focus of classification is to identify samples of the minority class. The performance of current data mining algorithms is not good enough for processing imbalanced datasets. The synthetic minority over-sampling technique (SMOTE) is specifically designed for learning from imbalanced datasets, generating synthetic minority class examples by interpolating between nearby minority class examples. However, SMOTE encounters the overgeneralization problem. The density-based spatial clustering of applications with noise (DBSCAN) algorithm is not rigorous when dealing with samples near the borderline. We optimize the DBSCAN algorithm for this problem to make clustering more reasonable. This paper integrates the optimized DBSCAN and SMOTE, and proposes a density-based synthetic minority over-sampling technique (DSMOTE). First, the optimized DBSCAN is used to divide the samples of the minority class into three groups: core samples, borderline samples and noise samples. The noise samples of the minority class are then removed so that more effective samples can be synthesized. In order to make full use of the information in core samples and borderline samples, different strategies are used to over-sample each group. Experiments show that DSMOTE achieves better results than SMOTE and Borderline-SMOTE in terms of precision, recall and F-value.
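
The pipeline summarized in the abstract (label minority samples by density, drop noise, interpolate among the rest) can be illustrated with a minimal Python sketch. It is not the authors' DSMOTE: the abstract does not specify the DBSCAN optimization or the separate strategies for core and borderline samples, so this sketch assumes standard DBSCAN core/border/noise roles and applies the same SMOTE-style nearest-neighbour interpolation to every kept sample. The function names and the parameters `eps`, `min_pts`, `k` and `n_new` are hypothetical, chosen for illustration only.

```python
import math
import random


def label_minority(points, eps, min_pts):
    """Label each minority sample 'core', 'border' or 'noise'.

    Assumption: standard DBSCAN roles, not the paper's optimized variant.
    A sample is core if at least min_pts samples (itself included) lie
    within eps; border if it is not core but has a core sample within eps.
    """
    neighbours = [
        [j for j, q in enumerate(points) if math.dist(p, q) <= eps]
        for p in points
    ]
    core = {i for i, nb in enumerate(neighbours) if len(nb) >= min_pts}
    labels = []
    for i, nb in enumerate(neighbours):
        if i in core:
            labels.append("core")
        elif any(j in core for j in nb):
            labels.append("border")
        else:
            labels.append("noise")
    return labels


def oversample(points, eps, min_pts, n_new, k=3, seed=0):
    """Create n_new synthetic minority samples after discarding noise.

    Assumption: core and border samples are treated identically here;
    each synthetic sample lies on the segment between a kept sample and
    one of its k nearest kept neighbours (SMOTE-style interpolation).
    """
    rng = random.Random(seed)
    labels = label_minority(points, eps, min_pts)
    kept = [p for p, lab in zip(points, labels) if lab != "noise"]
    if len(kept) < 2:
        return []  # nothing to interpolate between
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(kept)
        others = sorted(
            (q for q in kept if q is not base),
            key=lambda q: math.dist(base, q),
        )
        partner = rng.choice(others[:k])
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(
            tuple(b + gap * (p - b) for b, p in zip(base, partner))
        )
    return synthetic
```

For example, calling `oversample(minority_points, eps=1.5, min_pts=4, n_new=8)` on a list of coordinate tuples returns eight synthetic tuples, none of which is interpolated from a sample that was labeled noise.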

Keywords:
Computer science, Sampling (signal processing), Artificial intelligence, Statistical classification, Pattern recognition (psychology), Data mining, Machine learning, Algorithm, Computer vision

Metrics

Cited By: 103
FWCI (Field Weighted Citation Impact): 6.30
Refs: 37
Citation Normalized Percentile: 0.97

Topics

Imbalanced Data Classification Techniques
Physical Sciences →  Computer Science →  Artificial Intelligence

Related Documents

JOURNAL ARTICLE

DPC-SMOTE Over-sampling Algorithm for Imbalanced Data Classification

LIU Zhihan, ZHANG Zhonglin, ZHAO Lei

Journal: DOAJ (DOAJ: Directory of Open Access Journals)   Year: 2024

JOURNAL ARTICLE

Borderline over-sampling for imbalanced data classification

Hien M. Nguyen, Eric W. Cooper, Katsuari Kamei

Journal: International Journal of Knowledge Engineering and Soft Data Paradigms   Year: 2011   Vol: 3 (1)   Pages: 4-4