Multi-class classification is a fascinating field of study, but evaluating the performance of classifiers is difficult. Metrics such as accuracy, precision, recall, F-measure, Kappa, and the area under the receiver operating characteristic curve (AUC) can be used to evaluate classification performance; these indices describe the classification results achieved on each modelled class. Several measures have been introduced in the literature for this assessment, the most commonly used being accuracy. In general, these metrics were proposed for binary classification tasks, whereas multi-class classification is the more difficult and currently active research area in machine learning (ML). In this paper, we compare the classification performance of nine supervised machine learning algorithms drawn from three learner types (statistical, rule-based, and neural-network-based) in terms of accuracy, precision, recall, F-measure, and ROC area on four datasets from the UCI Machine Learning Repository. Among these, Random Forest achieved the best performance under both 10-fold cross-validation and percentage split, with overall average accuracies of 92.20% and 91.76% respectively and low variability, whereas Naïve Bayes performed worst under both protocols, with average correct classification rates of 79.18% and 76.92% respectively, and with higher variability, second only to Decision Table.
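The evaluation protocol described above (10-fold cross-validation applied to several classifiers, with mean and variability reported per model) can be sketched as follows. This is a minimal illustration, not the paper's actual code: it assumes scikit-learn, compares only two of the nine algorithms mentioned (Random Forest and Naïve Bayes), and uses scikit-learn's built-in iris dataset as a stand-in for the four UCI datasets.

```python
# Hypothetical sketch of the evaluation protocol: 10-fold CV accuracy
# for two of the classifiers compared in the paper. The iris dataset
# is a placeholder for the UCI datasets actually used.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)

models = {
    "Random Forest": RandomForestClassifier(random_state=0),
    "Naive Bayes": GaussianNB(),
}

results = {}
for name, model in models.items():
    # scoring="accuracy" matches the paper's primary metric; other
    # scorers ("precision_macro", "recall_macro", "f1_macro",
    # "roc_auc_ovr") cover the remaining indices for multi-class data.
    scores = cross_val_score(model, X, y, cv=10, scoring="accuracy")
    results[name] = (scores.mean(), scores.std())
    print(f"{name}: mean accuracy {scores.mean():.4f} (std {scores.std():.4f})")
```

The standard deviation across folds captures the "variability" the abstract refers to; the percentage-split protocol would instead use a single `train_test_split` before fitting and scoring each model once.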