Hazel A. Gameng, Bobby B. Gerardo, Ruji P. Medina
Oversampling techniques are applied during data preprocessing to mitigate the class-imbalance problem that arises in real research scenarios. Imbalance can reduce the ability of classification algorithms to recognize cases of interest, causing positive samples to be misclassified as negative or generating false positives. The Synthetic Minority Oversampling Technique (SMOTE) is one such oversampling technique, and Adaptive Synthetic sampling (Adasyn) is one of its many variants. Adasyn incorporates the K-Nearest Neighbor (KNN) algorithm; in this study, the Manhattan distance is applied in the KNN computations. The modified Adasyn was evaluated in terms of overall accuracy, precision, recall, and F1 measure on six imbalanced datasets, using logistic regression as the classification algorithm. The modified Adasyn outperformed SMOTE and the original Adasyn on 66.67 percent of the total performance-metric count: it led in accuracy and recall on 4 of the 6 datasets, in precision on 3 of 6, and in F1 measure on 5 of 6. This shows that the modified Adasyn can provide an efficient means of decreasing misclassification on imbalanced datasets.
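To make the described modification concrete, the sketch below implements Adasyn-style oversampling in NumPy with the KNN step computed under the Manhattan (L1) distance instead of the usual Euclidean distance. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the `beta` balancing level, and the fallback when no minority sample lies near the class boundary are all illustrative choices.

```python
import numpy as np

def manhattan_knn(query, data, k):
    """Indices of the k nearest rows of `data` to `query` under L1 distance."""
    d = np.abs(data - query).sum(axis=1)
    return np.argsort(d)[:k]

def adasyn_manhattan(X, y, minority_label=1, k=5, beta=1.0, rng=None):
    """Adasyn-style oversampling with Manhattan-distance KNN (illustrative sketch)."""
    rng = np.random.default_rng(rng)
    X_min = X[y == minority_label]
    X_maj = X[y != minority_label]
    G = int((len(X_maj) - len(X_min)) * beta)  # total synthetic samples to create
    if G <= 0 or len(X_min) < 2:
        return X, y

    # r[i]: fraction of majority points among each minority sample's k nearest
    # neighbours in the full dataset, computed with Manhattan distance.
    r = np.empty(len(X_min))
    for i, x in enumerate(X_min):
        idx = manhattan_knn(x, X, k + 1)[1:]   # skip the point itself
        r[i] = np.sum(y[idx] != minority_label) / k
    if r.sum() == 0:
        r[:] = 1.0                             # no borderline points: spread evenly
    g = np.rint(r / r.sum() * G).astype(int)   # per-sample synthesis quota

    synth = []
    for i, x in enumerate(X_min):
        if g[i] == 0:
            continue
        # Interpolate toward Manhattan-nearest *minority* neighbours.
        idx = manhattan_knn(x, X_min, min(k + 1, len(X_min)))[1:]
        for _ in range(g[i]):
            z = X_min[rng.choice(idx)]
            lam = rng.random()
            synth.append(x + lam * (z - x))

    if not synth:
        return X, y
    X_new = np.vstack([X, np.asarray(synth)])
    y_new = np.concatenate([y, np.full(len(synth), minority_label)])
    return X_new, y_new
```

Because hard-to-learn minority samples (those with many majority neighbours) receive larger quotas `g[i]`, synthesis is concentrated near the decision boundary, which is the adaptive behaviour that distinguishes Adasyn from plain SMOTE.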