JOURNAL ARTICLE

Using active learning selection approach for cross-project software defect prediction

Wenbo Mi, Yong Li, Ming Wen, Youren Chen

Year: 2022   Journal: Connection Science   Vol: 34 (1)   Pages: 1482-1499   Publisher: Taylor & Francis

Abstract

Cross-project defect prediction (CPDP) helps ensure software quality and plays an important role in software engineering. When a newly developed project has insufficient training data, CPDP can be used to build defect predictors from the data of other projects. However, CPDP does not take into account prior knowledge of the target project or the class imbalance in the source project data. In this paper, we design an active learning selection algorithm for cross-project defect prediction to alleviate these problems. First, we use clustering and active learning algorithms to filter and label some representative data from the target project, and use these data as prior knowledge to guide the selection of source projects. Then, the active learning algorithm is used to filter representative data from the source projects. Finally, a balanced cross-project dataset is constructed using the active learning algorithm, and the defect prediction model is built. We evaluate the approach on 10 open-source projects using common defect prediction models, active learning algorithms, and common evaluation metrics. The results show that the proposed algorithm can effectively filter the data, solve the class imbalance problem in cross-project data, and improve defect prediction performance.
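
The three steps named in the abstract (pick and label representative target data, use it to guide source selection, build a balanced training set) can be illustrated with a minimal sketch. This is not the authors' algorithm, which the abstract does not specify: the plain k-means clustering, the "nearest to centroid" query rule, the per-class quota used for balancing, and every function name and data value below are assumptions made only for illustration.

```python
import math


def dist(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, k, iters=20):
    """Plain k-means with evenly spaced initial centroids; returns centroids."""
    centroids = [list(points[i * len(points) // k]) for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: dist(p, centroids[c]))
            groups[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        centroids = [
            [sum(col) / len(g) for col in zip(*g)] if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids


def representatives(points, k):
    """Indices of the points closest to each cluster centroid (deduplicated)."""
    reps = []
    for c in kmeans(points, k):
        idx = min(range(len(points)), key=lambda i: dist(points[i], c))
        if idx not in reps:
            reps.append(idx)
    return reps


def select_balanced_source(source, labelled_reps, per_class):
    """Per class, keep the source rows nearest to target representatives.

    Rows are (features, label). Same-label representatives are preferred as
    anchors; if a class has none, all representatives are used instead.
    """
    chosen = []
    for label in sorted({y for _, y in source}):
        anchors = [x for x, y in labelled_reps if y == label]
        anchors = anchors or [x for x, _ in labelled_reps]
        rows = [row for row in source if row[1] == label]
        rows.sort(key=lambda row: min(dist(row[0], a) for a in anchors))
        chosen.extend(rows[:per_class])  # equal quota per class (assumed rule)
    return chosen


# Hypothetical toy data: unlabelled target project, labelled source project.
target = [[0.0, 0.1], [0.2, 0.0], [5.0, 5.1], [5.2, 4.9]]
oracle_labels = [0, 0, 1, 1]  # what a human oracle would answer if queried

reps_idx = representatives(target, k=2)
# Only the representatives are "queried", mimicking a small labelling budget.
labelled_reps = [(target[i], oracle_labels[i]) for i in reps_idx]

source = [
    ([0.1, 0.1], 0), ([0.3, 0.2], 0), ([9.0, 9.0], 0), ([4.0, 4.0], 0),
    ([5.1, 5.0], 1), ([8.0, 1.0], 1),
]
balanced = select_balanced_source(source, labelled_reps, per_class=2)
```

In this toy run the source data starts with four non-defective rows and two defective ones, and `balanced` keeps two of each, favouring rows that lie near the labelled target representatives. A defect predictor would then be trained on `balanced`; that step is omitted here because the abstract does not name a specific model.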

Keywords:
Computer science, Machine learning, Artificial intelligence, Data mining, Cluster analysis, Filter (signal processing), Software, Selection (genetic algorithm), Active learning (machine learning), Class (philosophy)

Metrics

Cited By: 7
FWCI (Field Weighted Citation Impact): 2.66
Refs: 28
Citation Normalized Percentile: 0.90

Topics

Software Engineering Research
Physical Sciences →  Computer Science →  Information Systems
Software Reliability and Analysis Research
Physical Sciences →  Computer Science →  Software
Software Engineering Techniques and Practices
Physical Sciences →  Computer Science →  Information Systems