Majid Mohammadi, Dario Di Nucci, Damian A. Tamburri
Machine learning is widely used to predict defect-prone software components, facilitating testing and improving application quality. A recent meta-analysis of binary classification for software defect prediction showed that so-called researcher bias, i.e., the influence of the group conducting the study, plays a critical role; that analysis, however, relied exclusively on null hypothesis significance testing. Since null hypothesis testing is based on the p-value, which is not the desired likelihood of the null hypothesis itself, it suffers from several important drawbacks. This article presents a Bayesian analysis of the same dataset, which overcomes the pitfalls of the null hypothesis testing approach and relaxes the assumptions of the methods used in the previous study. While the Bayesian analysis in this article identifies the software metrics as the most influential factor for a classifier's performance, researcher bias remains the second most important factor: precautions against researcher bias are therefore still critical in software defect prediction endeavors. To corroborate this finding, we analyze the data with more advanced Bayesian modeling, through which we identify (1) the classifiers with better performance, (2) the datasets whose instances are harder to predict, and (3) the metrics that impact a classifier's performance.
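The abstract's central contrast, that a p-value is not the probability of a hypothesis while a Bayesian posterior directly is, can be illustrated with a minimal sketch. The example below is not the paper's model: it uses hypothetical fold-level accuracy counts (assumed numbers) and a simple Beta-Binomial posterior to compute the probability that one classifier outperforms another.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical evaluation counts (assumptions, not data from the study):
# classifier A gets 85 of 100 test instances right, classifier B gets 78.
correct_a, n_a = 85, 100
correct_b, n_b = 78, 100

# With a Beta(1, 1) prior and a binomial likelihood, the posterior over
# each classifier's accuracy is again a Beta distribution.
post_a = rng.beta(1 + correct_a, 1 + n_a - correct_a, size=100_000)
post_b = rng.beta(1 + correct_b, 1 + n_b - correct_b, size=100_000)

# Monte Carlo estimate of P(accuracy_A > accuracy_B | data): a direct
# probabilistic statement about the comparison, unlike a p-value.
prob_a_better = (post_a > post_b).mean()
print(f"P(accuracy_A > accuracy_B | data) = {prob_a_better:.3f}")
```

The hierarchical models used in the article generalize this idea across datasets, metrics, and research groups, but the interpretive advantage is the same: the output is the quantity practitioners actually want.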