This paper develops a new algorithm that selects an appropriate feature set and addresses the challenges posed by microarray data. The algorithm first balances the dataset and then combines the importance scores obtained from a random forest with mutual information to form the new selection technique. The combination of each feature's importance score with its mutual information is termed qualitative mutual information (QMI). An experiment compares the final feature subset produced by the algorithm with the subsets obtained using the random forest importance score alone and using the proposed QMI approach. The comparison is made in terms of the number of features selected and the classification accuracy of three classifiers: Naïve Bayes, C4.5 and IB1. The results show that the proposed algorithm effectively reduces the number of features and improves classification accuracy. The experiment also shows that combining the random forest importance score with mutual information is more effective than applying either one individually.
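As an illustration of the scoring step described above, the following is a minimal sketch and not the paper's implementation. It assumes that features are already discretized, that the importance scores are supplied externally (the values below are hypothetical stand-ins for random forest output), and that the two quantities are combined by simple addition. The function names `mutual_information`, `qmi_scores` and `top_k` are illustrative, and the dataset balancing step is omitted.

```python
from collections import Counter
from math import log2


def mutual_information(feature, labels):
    """Mutual information I(X; Y) in bits between two discrete sequences."""
    n = len(labels)
    joint = Counter(zip(feature, labels))
    px = Counter(feature)
    py = Counter(labels)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * log2(c * n / (px[x] * py[y]))
    return mi


def qmi_scores(columns, labels, importances):
    """Assumed combination rule: importance score plus mutual information."""
    return [imp + mutual_information(col, labels)
            for col, imp in zip(columns, importances)]


def top_k(scores, k):
    """Indices of the k highest-scoring features, best first."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]


# Toy data: three discretized features for six samples, two classes.
labels = [0, 0, 0, 1, 1, 1]
columns = [
    [0, 0, 0, 1, 1, 1],  # perfectly predicts the label
    [0, 1, 0, 1, 0, 1],  # nearly uninformative
    [0, 0, 1, 1, 1, 1],  # partially informative
]
# Hypothetical importance scores, e.g. as a random forest might report them.
importances = [0.6, 0.1, 0.3]

scores = qmi_scores(columns, labels, importances)
selected = top_k(scores, k=2)
print(selected)
```

On this toy data the first and third features receive the highest combined scores. In practice the two terms may need to be rescaled to a common range before they are added, since importance scores and mutual information are measured on different scales.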
Zhongxin Wang, Gang Sun, Jing Zhang, Jia Zhao
Atiyeh Mortazavi, Mohammad Hossein Moattar
Nirmalya Bandyopadhyay, Tamer Kahveci, Steve Goodison, Yijun Sun, Sanjay Ranka