Majid Mohammadi, Dario Di Nucci, Damian A. Tamburri
Machine learning is widely used to predict defect-prone software components, facilitating testing and improving application quality. A recent meta-analysis of binary classification for software defect prediction showed that so-called researcher bias, i.e., the influence of the group conducting the study, plays a critical role; that analysis, however, relied exclusively on null hypothesis significance testing. Since null hypothesis testing is based on the p-value, which is not the desired likelihood of the null hypothesis itself, it suffers from several important drawbacks. This article presents a Bayesian analysis of the same dataset, which overcomes the pitfalls of the null hypothesis testing approach and relaxes the assumptions of the methods used in the previous study. While the Bayesian analysis in this article identifies the software metrics as the most influential factor for a classifier's performance, researcher bias remains the second most important factor: precautions against researcher bias are therefore still critical in software defect prediction endeavors. To corroborate this finding, we analyze the data with more advanced Bayesian modeling, through which we identify (1) the classifiers with better performance, (2) the datasets whose instances are harder to predict, and (3) the metrics that impact a classifier's performance.
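The abstract's central contrast, that a p-value is not the probability of a hypothesis while a Bayesian posterior directly is, can be illustrated with a minimal sketch. The example below is not the paper's model: it uses hypothetical fold-level accuracy counts (assumed numbers) and a simple Beta-Binomial posterior to compute the probability that one classifier outperforms another.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical evaluation counts (assumptions, not data from the study):
# classifier A gets 85 of 100 test instances right, classifier B gets 78.
correct_a, n_a = 85, 100
correct_b, n_b = 78, 100

# With a Beta(1, 1) prior and a binomial likelihood, the posterior over
# each classifier's accuracy is again a Beta distribution.
post_a = rng.beta(1 + correct_a, 1 + n_a - correct_a, size=100_000)
post_b = rng.beta(1 + correct_b, 1 + n_b - correct_b, size=100_000)

# Monte Carlo estimate of P(accuracy_A > accuracy_B | data): a direct
# probabilistic statement about the comparison, unlike a p-value.
prob_a_better = (post_a > post_b).mean()
print(f"P(accuracy_A > accuracy_B | data) = {prob_a_better:.3f}")
```

The hierarchical models used in the article generalize this idea across datasets, metrics, and research groups, but the interpretive advantage is the same: the output is the quantity practitioners actually want.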