Software defect prediction (SDP) is an important software quality assurance technique that uses metric data to predict whether a software module is defective. The quality of the data influences the performance of the prediction model. One such data quality problem is high dimensionality, where the dataset contains unnecessary features. To address this problem, we propose a hybrid feature selection (HFS) method that combines different feature ranking techniques. First, we calculate three values for each feature: chi-squared (CS), information gain (IG), and the Pearson correlation coefficient. Second, we rank the features by each of the three values and select features based on these rankings. Finally, we use a random forest to build the prediction model. To validate the approach, we conducted experiments on 5 NASA datasets. The results show that our approach can select a smaller subset of features that improves performance in terms of F-measure.
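The ranking and selection steps can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the three rankings are combined or how continuous metrics are discretized. The sketch therefore assumes a median split for the chi-squared and information gain computations, averages the three ranks, and keeps the top `k` features. All function names and the toy data are hypothetical, and the final random forest step is omitted.

```python
import math
import statistics
from collections import Counter

def binarize(xs):
    # Assumption: split each metric at its median (1 if above, else 0).
    median = statistics.median(xs)
    return [1 if x > median else 0 for x in xs]

def chi_squared(bins, ys):
    # Chi-squared statistic of the contingency table (binned feature vs. label).
    n = len(ys)
    counts, row, col = Counter(zip(bins, ys)), Counter(bins), Counter(ys)
    stat = 0.0
    for b in row:
        for y in col:
            expected = row[b] * col[y] / n
            stat += (counts[(b, y)] - expected) ** 2 / expected
    return stat

def entropy(labels):
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def information_gain(bins, ys):
    # H(y) minus the conditional entropy of y given the binned feature.
    n = len(ys)
    cond = 0.0
    for b in set(bins):
        subset = [y for v, y in zip(bins, ys) if v == b]
        cond += len(subset) / n * entropy(subset)
    return entropy(ys) - cond

def pearson_abs(xs, ys):
    # Absolute Pearson correlation between a feature column and the labels.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return abs(cov / (sx * sy)) if sx > 0 and sy > 0 else 0.0

def ranks(scores):
    # Rank 1 is the highest score; ties are broken by feature index.
    order = sorted(range(len(scores)), key=lambda j: -scores[j])
    result = [0] * len(scores)
    for position, j in enumerate(order, start=1):
        result[j] = position
    return result

def hybrid_select(X, y, k):
    # Assumption: combine the three rankings by their mean and keep the top k.
    columns = list(zip(*X))
    cs = [chi_squared(binarize(c), y) for c in columns]
    ig = [information_gain(binarize(c), y) for c in columns]
    pc = [pearson_abs(c, y) for c in columns]
    rank_lists = [ranks(cs), ranks(ig), ranks(pc)]
    mean_rank = [sum(r[j] for r in rank_lists) / 3 for j in range(len(columns))]
    return sorted(range(len(columns)), key=lambda j: mean_rank[j])[:k]

# Toy data (hypothetical): 8 modules, 3 metrics, label 1 = defective.
f0 = [1, 2, 3, 4, 10, 11, 12, 13]   # separates the classes
f1 = [5, 1, 5, 1, 5, 1, 5, 1]       # unrelated to the label
f2 = [1, 2, 3, 9, 4, 8, 9, 10]      # partially informative
y = [0, 0, 0, 0, 1, 1, 1, 1]
X = list(zip(f0, f1, f2))

selected = hybrid_select(X, y, k=2)
print(selected)
```

On this toy data the uninformative metric `f1` is dropped. In a full pipeline, the selected columns would then be passed to a random forest classifier and evaluated by F-measure.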
Xiao Yu, Ziyi Ma, Chuanxiang Ma, Yi Gu, Ruiqi Liu, Yan Zhang
Muhammad Yoga Adha Pratama, Rudy Herteno, Mohammad Reza Faisal, Radityo Adi Nugroho, Friska Abadi