LI Li, REN Zhenkang, SHI Kexin
Software defect prediction can effectively improve software reliability and help remedy loopholes in a system. Boosting with resampling is a common way to address the shortage of samples for software defect prediction. However, the conventional Boosting method is ineffective at handling class imbalance in this domain. Therefore, a cost-sensitive Boosting method for software defect prediction, named CSBst, is proposed in this study. Because missed detections and false positives for defective modules carry different costs, the cost-sensitive Boosting method increases the weights of samples with the first type of error when updating sample weights. This ensures that their updated weights exceed those of defect-free samples and of samples with the second type of error, which improves the prediction rate for defective modules. The threshold-moving method is used to integrate the classification results of multiple decision tree-based classifiers and thereby counter overfitting. Subsequently, the optimal weight and threshold values used in model construction are determined analytically. Experiments on the NASA software defect prediction dataset demonstrate that, with small samples, the BAL prediction index of CSBst is 7% and 3% higher than that of the CSBKNN and CSCE methods, respectively. Moreover, the time complexity is reduced by one order of magnitude.
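The abstract does not state the CSBst update rule, so the following is only a minimal sketch of the general idea and not the authors' implementation. It assumes an AdaBoost-style learner built from one-level decision trees (stumps), labels of +1 for defective and -1 for defect-free modules, and that the first type of error means a defective module predicted as defect-free. The names `miss_cost`, `fit_cost_sensitive_boost`, `score`, and `predict` are hypothetical, and the cost multiplier and threshold are fixed by hand here rather than derived analytically.

```python
import math

def stump_predict(stump, x):
    """One-level decision tree: predict `polarity` when the feature exceeds the threshold."""
    feature, threshold, polarity = stump
    return polarity if x[feature] > threshold else -polarity

def fit_stump(X, y, w):
    """Exhaustively search for the stump with the lowest weighted error."""
    best, best_err = None, float("inf")
    for feature in range(len(X[0])):
        for threshold in sorted({row[feature] for row in X}):
            for polarity in (1, -1):
                stump = (feature, threshold, polarity)
                err = sum(wi for xi, yi, wi in zip(X, y, w)
                          if stump_predict(stump, xi) != yi)
                if err < best_err:
                    best, best_err = stump, err
    return best, best_err

def fit_cost_sensitive_boost(X, y, rounds=10, miss_cost=3.0):
    """AdaBoost-style loop; missed defects get an extra weight multiplier (assumed rule)."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        stump, err = fit_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)   # keep the logarithm finite
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        new_w = []
        for xi, yi, wi in zip(X, y, w):
            pred = stump_predict(stump, xi)
            wi *= math.exp(-alpha * yi * pred)  # standard AdaBoost reweighting
            if yi == 1 and pred == -1:
                wi *= miss_cost                 # extra penalty for a missed defect
            new_w.append(wi)
        total = sum(new_w)
        w = [wi / total for wi in new_w]
    return ensemble

def score(ensemble, x):
    """Weighted vote of all stumps, normalized to the range [-1, 1]."""
    total_alpha = sum(alpha for alpha, _ in ensemble)
    if total_alpha == 0:
        return 0.0
    return sum(alpha * stump_predict(s, x) for alpha, s in ensemble) / total_alpha

def predict(ensemble, x, threshold=0.0):
    """Threshold moving: a lower threshold flags more modules as defective."""
    return 1 if score(ensemble, x) > threshold else -1
```

With `miss_cost` greater than 1, a missed defect ends each round with a larger weight than a false alarm, so later stumps concentrate on defective modules. Lowering `threshold` below 0 then trades additional false alarms for fewer missed defects at prediction time.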
Ling Xu, Wang Bei, Ling Liu, Mo Zhou, Shengping Liao, Meng Yan
Weidong Zhao, Shengdong Zhang, Ming Wang
Liang Niu, Jianwu Wan, Hongyuan Wang, Kaiwei Zhou