Software defect prediction (SDP) helps software developers and quality assurance personnel predict the fault proneness of software. Recently, researchers have proposed many methods to improve prediction results, especially in the within-project defect prediction (WPDP) setting. However, cross-project defect prediction (CPDP) remains difficult because the data distributions of the source and target projects differ. Transfer learning models have been shown to reduce this distribution difference effectively. Intuitively, if a better source is selected, a transfer learning model should achieve better prediction performance. In this paper, we conduct an empirical study of source selection for CPDP, covering both feature selection and source project selection, and then combine source selection with the popular transfer learning model TCA+. The results show that the combined technique, MZTCA+, effectively improves on state-of-the-art CPDP models such as TCA+, LT, Dycom, and TDS.
Wanzhi Wen, Ningbo Zhu, Bingqing Ye, Xikai Li, Chuyue Wang, Jiawei Chu, Yuehua Li