Multi-class classification is a fascinating field of study, but evaluating the performance of classifiers is difficult. Metrics such as accuracy, precision, recall, F-measure, Kappa, and the area under the receiver operating characteristic curve (AUC) can be used to evaluate classification performance; these indices describe the classification results achieved on each modelled class. Several measures have been introduced in the literature for this assessment, the most commonly used being accuracy. In general, these metrics were proposed for binary classification tasks, whereas multi-class classification is the more difficult and currently active research area in machine learning (ML). In this paper, we compare the classification performance of nine supervised machine learning algorithms drawn from three learner types (statistical, rule-based, and neural-network-based) in terms of accuracy, precision, recall, F-measure, and ROC area on four datasets from the UCI Machine Learning Repository. Among these, Random Forest achieved the best performance under both 10-fold cross-validation and percentage split, with overall average accuracies of 92.20% and 91.76% respectively and low variability, whereas Naïve Bayes performed worst under both protocols, with average correct classification rates of 79.18% and 76.92% respectively, and with higher variability, second only to Decision Table.
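The evaluation protocol described above (10-fold cross-validation applied to several classifiers, with mean and variability reported per model) can be sketched as follows. This is a minimal illustration, not the paper's actual code: it assumes scikit-learn, compares only two of the nine algorithms mentioned (Random Forest and Naïve Bayes), and uses scikit-learn's built-in iris dataset as a stand-in for the four UCI datasets.

```python
# Hypothetical sketch of the evaluation protocol: 10-fold CV accuracy
# for two of the classifiers compared in the paper. The iris dataset
# is a placeholder for the UCI datasets actually used.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)

models = {
    "Random Forest": RandomForestClassifier(random_state=0),
    "Naive Bayes": GaussianNB(),
}

results = {}
for name, model in models.items():
    # scoring="accuracy" matches the paper's primary metric; other
    # scorers ("precision_macro", "recall_macro", "f1_macro",
    # "roc_auc_ovr") cover the remaining indices for multi-class data.
    scores = cross_val_score(model, X, y, cv=10, scoring="accuracy")
    results[name] = (scores.mean(), scores.std())
    print(f"{name}: mean accuracy {scores.mean():.4f} (std {scores.std():.4f})")
```

The standard deviation across folds captures the "variability" the abstract refers to; the percentage-split protocol would instead use a single `train_test_split` before fitting and scoring each model once.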