In order to improve the effect of software defect prediction, many algorithms including feature selection, have been proposed. Based on Wrapper and Filter hybrid framework, a feature-grouping-based feature selection algorithm is proposed in this paper. The algorithm is composed of two steps. In the first step, in order to remove the redundant features, we group the features according to the redundancy between the features. The symmetry uncertainty is used as the constant indicator of the correlation and the FCBF-based grouping algorithm is used to group the features. In the second step, a subset of the features are selected from each group to form the final subset of features. Many classical methods select the representative feature from each group. We consider that when the number of intra-group features is large, the representative features are not enough to reflect the information in this group. Therefore, we require that at least one feature be selected within each group, in this step, the PSO algorithm is used for Searching Randomly from each group. We tested on the open source NASA and PROMISE data sets. Using three kinds of classifier. Compared to the other methods tested in this article, our method resulted in 90% improvement in the predictive performance of 30 sets of results on 10 data sets. Compared with the algorithms without feature selection, the AUC values of this method in the Logistic regression, Naive Bayesian, and K-neighbor classifiers are improved by 5.94% and 4.69% And 8.05%. The FCBF algorithm can also be regarded as a kind of first performing feature grouping. Compared with the FCBF algorithm, the AUC values of this method are improved by 4.78%, 6.41% and 4.4% on the basis of Logistic regression, Naive Bayes and K-neighbor. We can also see that for the FCBF-based grouping algorithm, it could be better to choose a characteristic cloud from each group than to choose a representative one.
Xiaolong XuWen ChenXinheng Wang
Radityo Adi NugrohoFriska AbadiMuhammad FaisalRudy HertenoRahmat Ramadhani
Dyana Rashid IbrahimRawan GhnematAmjad Hudaib