JOURNAL ARTICLE

Cross-Project Defect Prediction Method based on Feature Distribution Alignment and Neighborhood Instance Selection

Yi Zhu, Yu Zhao, Qiao Yu, Xiaoying Chen

Year: 2022 Journal: 網際網路技術學刊 (Journal of Internet Technology) Vol: 23 (4) Pages: 761-769 Publisher: Taiwan Academic Network

Abstract

In software development practice, the project under development is often a brand-new project. Defect prediction for such a project relies on data collected from other, similar projects (source projects) to build a prediction model, which is then used to predict defects in the project under development (the target project). However, a model built from source-project data often fails to achieve the desired performance on the target project, mainly because the data distributions of the two projects differ substantially. This difference appears chiefly in the distribution of features between projects and in differences between individual instances. To address both problems, this paper proposes a cross-project defect prediction method that works at both the feature level and the instance level. The method first aligns the feature distributions using the available target-project data and the source-project data. It then selects the labeled source-project instances that are similar to the unlabeled instances in the target project, and finally builds a defect prediction model from the selected source-project instances. Cross-project defect prediction experiments were carried out on the Relink and Promise datasets. Compared with the classic instance-based cross-project defect prediction method, the proposed method significantly improves F-measure and AUC; compared with within-project defect prediction, it achieves comparable performance.
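
The three steps named in the abstract (align feature distributions, select neighboring source instances, train on the selection) can be illustrated with a minimal sketch. The abstract does not specify the alignment technique, the similarity measure, or the classifier, so everything below is an assumption made for illustration: per-project z-score standardization stands in for the alignment step, Euclidean nearest neighbors for the instance selection, and a k-NN majority vote for the prediction model. All function names and parameters (`zscore_align`, `select_neighbors`, `knn_predict`, `cpdp_predict`, `k_select`, `k_vote`) are hypothetical.

```python
import math
from statistics import mean, pstdev


def zscore_align(source, target):
    """Standardize each project with its own per-feature mean and std.

    Assumption: a simple stand-in for the paper's alignment step, which
    the abstract does not describe in detail.
    """
    def standardize(rows):
        cols = list(zip(*rows))
        mus = [mean(c) for c in cols]
        sds = [pstdev(c) or 1.0 for c in cols]  # guard constant features
        return [[(v - m) / s for v, m, s in zip(row, mus, sds)]
                for row in rows]
    return standardize(source), standardize(target)


def select_neighbors(source, target, k=2):
    """Indices of source instances among the k nearest to any target instance."""
    chosen = set()
    for t in target:
        ranked = sorted(range(len(source)),
                        key=lambda i: math.dist(source[i], t))
        chosen.update(ranked[:k])
    return sorted(chosen)


def knn_predict(train_x, train_y, x, k=3):
    """Majority vote of the k nearest training labels (1 = defective)."""
    ranked = sorted(range(len(train_x)),
                    key=lambda i: math.dist(train_x[i], x))
    votes = [train_y[i] for i in ranked[:k]]
    return int(sum(votes) * 2 > len(votes))


def cpdp_predict(source_x, source_y, target_x, k_select=2, k_vote=3):
    """Align, select neighboring source instances, then predict target labels."""
    src, tgt = zscore_align(source_x, target_x)
    keep = select_neighbors(src, tgt, k_select)
    train_x = [src[i] for i in keep]
    train_y = [source_y[i] for i in keep]
    return [knn_predict(train_x, train_y, t, k_vote) for t in tgt]
```

In this sketch only the source labels are used for training; the target project contributes unlabeled feature vectors, which matches the setting described in the abstract. Any real replication would substitute the alignment and classifier actually used in the paper.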

Keywords:
Computer science, Predictive modelling, Data mining, Software, Feature (linguistics), Project management, Feature selection, Software project management, Machine learning, Software development, Systems engineering

Metrics

Cited By: 1
FWCI (Field Weighted Citation Impact): 0.38
Refs: 38
Citation Normalized Percentile: 0.60

Topics

Software Engineering Research
Physical Sciences →  Computer Science →  Information Systems
Software Engineering Techniques and Practices
Physical Sciences →  Computer Science →  Information Systems
Software Reliability and Analysis Research
Physical Sciences →  Computer Science →  Software
