Real-world classification problems often involve imbalanced, multiclass datasets. Class imbalance combined with multiple classes distorts the patterns a classification model learns and lowers its accuracy. Oversampling rebalances the class distribution and helps avoid overfitting. The purposes of this study were to handle multiclass imbalanced datasets and to improve the effectiveness of the classification model. The study proposes a hybrid method that combines the Synthetic Minority Oversampling Technique (SMOTE) and One-Versus-All (OVA) with a deep learning classifier and two ensemble classifiers, stacking and random forest, for handling multiclass imbalanced data. Datasets with different numbers of classes and different degrees of imbalance were obtained from the UCI Machine Learning Repository. The results show that the proposed method achieved its best accuracy, 98.51%, on the new-thyroid dataset when the deep learning classifier was used to evaluate classification performance. With the stacking algorithm, the proposed method achieved higher accuracy than the other methods on the car, pageblocks, and Ecoli datasets. In addition, the highest classification performance on the dermatology dataset, 98.47%, was obtained when random forest was used as the classifier.
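To make the combination of OVA decomposition and SMOTE-style oversampling concrete, the following is a minimal sketch in plain Python. It is not the authors' implementation: the function names `smote_like` and `balance_one_vs_all`, the toy data, the neighbour count `k`, and the choice to oversample whichever side of each binary problem is smaller are all assumptions made for illustration. The classifiers named in the abstract (deep learning, stacking, random forest) are not reproduced here; each balanced binary problem would be passed to one of them.

```python
import random


def smote_like(samples, n_new, k=3, rng=None):
    """Create n_new synthetic points, each interpolated between a sample
    and one of its k nearest neighbours from the same group."""
    rng = rng or random.Random(0)
    if len(samples) < 2:
        raise ValueError("need at least two samples to interpolate")
    synthetic = []
    for _ in range(n_new):
        base_idx = rng.randrange(len(samples))
        base = samples[base_idx]
        # Candidate neighbours: every other sample, nearest first
        # (squared Euclidean distance).
        others = [s for i, s in enumerate(samples) if i != base_idx]
        others.sort(key=lambda s: sum((a - b) ** 2 for a, b in zip(base, s)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position on the segment base -> neighbour
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic


def balance_one_vs_all(X, y, positive, k=3, seed=0):
    """Build the binary problem 'positive versus all other classes' and
    oversample the smaller side until both sides have the same size."""
    rng = random.Random(seed)
    pos = [x for x, label in zip(X, y) if label == positive]
    neg = [x for x, label in zip(X, y) if label != positive]
    if len(pos) < len(neg):
        pos = pos + smote_like(pos, len(neg) - len(pos), k, rng)
    elif len(neg) < len(pos):
        neg = neg + smote_like(neg, len(pos) - len(neg), k, rng)
    # Label 1 marks the positive class, 0 marks the rest.
    return pos + neg, [1] * len(pos) + [0] * len(neg)


# Hypothetical toy data: three classes with 3, 4, and 2 samples.
X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
     (1.0, 1.0), (1.1, 0.9), (0.9, 1.2), (1.2, 1.1),
     (3.0, 3.0), (3.1, 2.9)]
y = ["a", "a", "a", "b", "b", "b", "b", "c", "c"]

# One balanced binary problem per class.
problems = {label: balance_one_vs_all(X, y, label) for label in sorted(set(y))}

for label, (Xb, yb) in problems.items():
    print(label, "positives:", yb.count(1), "negatives:", yb.count(0))
```

In practice a maintained SMOTE implementation would normally replace the hand-written interpolation above. Note also that this sketch oversamples after the OVA split; the order in which the study applies the two steps is not stated in the abstract.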
Mohammad Zoynul Abedin, Guotai Chi, Petr Hájek, Tong Zhang
Mohd Shamrie Sainin, Rayner Alfred