JOURNAL ARTICLE

Distributed Synthetic Minority Oversampling Technique

Sakshi Hooda, Suman Mann

Year: 2019   Journal: International Journal of Computational Intelligence Systems   Vol: 12 (2)   Pages: 929-929   Publisher: Springer Nature

Abstract

Real-world prediction problems often involve predicting rare occurrences. Standard classification algorithms are biased against these rare events because of the resulting data imbalance. Typical approaches to this imbalance involve oversampling the “rare events” or undersampling the majority events. Synthetic Minority Oversampling Technique (SMOTE) is one technique that addresses this class imbalance effectively. However, existing implementations of SMOTE fail when the data grows and can no longer be stored on a single machine. In this paper, we present our solution to this “big data challenge.” We provide a distributed version of SMOTE using scalable k-means++ and M-Trees. With this implementation of SMOTE, we were able to oversample the “rare events” and achieve results that are better than those of the existing Python version of SMOTE.
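For readers unfamiliar with the oversampling step the abstract refers to, the following is a minimal single-machine sketch of the basic SMOTE idea: each synthetic point is placed at a random position on the line segment between a minority sample and one of its k nearest minority neighbours. It is not the authors' distributed implementation; it uses a brute-force neighbour search in place of the scalable k-means++ and M-Tree machinery named above, and the function name `smote_sketch` and its parameters are hypothetical, chosen only for illustration.

```python
import numpy as np


def smote_sketch(minority, n_synthetic, k=5, seed=0):
    """Illustrative SMOTE-style oversampling (assumed helper, not the paper's code).

    minority    : array of shape (n, d) holding only minority-class samples
    n_synthetic : number of synthetic samples to generate
    k           : number of nearest minority neighbours to interpolate towards
    """
    rng = np.random.default_rng(seed)
    minority = np.asarray(minority, dtype=float)
    n = minority.shape[0]
    if n < 2:
        raise ValueError("need at least two minority samples")
    k = min(k, n - 1)

    # Brute-force pairwise squared Euclidean distances, shape (n, n).
    # This O(n^2) step is what stops scaling on large data.
    diffs = minority[:, None, :] - minority[None, :, :]
    dists = np.einsum("ijk,ijk->ij", diffs, diffs)
    np.fill_diagonal(dists, np.inf)  # a point is not its own neighbour

    # Indices of the k nearest minority neighbours of every sample.
    neighbours = np.argsort(dists, axis=1)[:, :k]

    # Pick a random base sample and one of its neighbours for each new point.
    base = rng.integers(0, n, size=n_synthetic)
    picked = neighbours[base, rng.integers(0, k, size=n_synthetic)]

    # Interpolate at a random fraction in [0, 1) along the connecting segment.
    gap = rng.random((n_synthetic, 1))
    return minority[base] + gap * (minority[picked] - minority[base])
```

Calling `smote_sketch(X_minority, 100)` on an (n, d) array returns a (100, d) array of synthetic minority samples. The pairwise distance matrix in this sketch needs memory quadratic in the number of minority samples, which illustrates why a single-machine approach breaks down as data grows.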

Keywords:
Oversampling; Computer science; Pattern recognition (psychology); Artificial intelligence; Telecommunications; Bandwidth (computing)

Metrics

Cited By: 10
FWCI (Field Weighted Citation Impact): 0.82
Refs: 0
Citation Normalized Percentile: 0.73

Topics

Speech and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing
Advanced Algorithms and Applications
Physical Sciences →  Engineering →  Control and Systems Engineering
Face and Expression Recognition
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition