Severity-based Software Quality Prediction using Class Imbalanced Data

Euy-Seok Hong; Mi-Kyeong Park

doi:10.9708/jksci.2016.21.4.073

JOURNAL ARTICLE

Year: 2016; Journal: Journal of the Korea Society of Computer and Information; Vol: 21 (4); Pages: 73-80; Publisher: Korean Society of Computer Information

Abstract

Most fault prediction models suffer from class imbalance because training data usually contain many more non-fault modules than fault modules. This imbalanced distribution makes it difficult for the models to learn the minority-class modules. The imbalance is even greater in severity-based fault prediction, because high-severity fault modules are a smaller subset of the fault modules. In this paper, we propose severity-based models that address these problems using three sampling methods: Resample, SpreadSubSample, and SMOTE. Empirical results show that the Resample method exhibits typical overfitting problems and that the SpreadSubSample method does not improve the prediction performance of the models. Unlike these two methods, SMOTE performs well in terms of AUC and FNR values. In particular, the J48 decision tree model using SMOTE outperforms the other prediction models.
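As an illustration of two terms used in the abstract, the following is a minimal sketch of the SMOTE idea (synthesizing minority-class samples by interpolating between a sample and one of its nearest minority neighbours) and of the false negative rate, FNR = FN / (FN + TP). It is not the authors' implementation: the function names `smote_sketch` and `false_negative_rate`, the parameter `k`, and the toy data are assumptions made here for illustration only.

```python
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points built from minority-class samples.

    Hypothetical helper: each synthetic point lies on the line segment
    between a randomly chosen minority sample and one of its k nearest
    minority neighbours (squared Euclidean distance).
    Requires at least two minority samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by distance to the base sample.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


def false_negative_rate(y_true, y_pred, positive=1):
    """FNR = FN / (FN + TP): share of fault modules predicted as non-fault."""
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    return fn / (fn + tp) if (fn + tp) else 0.0


# Toy data (invented): four high-severity fault modules with two metrics each.
minority = [[1.0, 2.0], [2.0, 3.0], [3.0, 1.0], [2.5, 2.5]]
new_points = smote_sketch(minority, n_new=5)
fnr = false_negative_rate([1, 1, 1, 0, 0], [1, 0, 0, 0, 1])
```

Because every synthetic point is a convex combination of two real minority samples, it stays inside the region those samples span, which is one common explanation for why SMOTE tends to overfit less than duplicating existing samples.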

Keywords:

C4.5 algorithm; Data mining; Class (philosophy); Computer science; Fault (geology); Decision tree; Machine learning; Predictive modelling; Artificial intelligence; Support vector machine; Naive Bayes classifier

Topics

Software Engineering Research

Physical Sciences → Computer Science → Information Systems

Imbalanced Data Classification Techniques

Physical Sciences → Computer Science → Artificial Intelligence

Software Reliability and Analysis Research

Physical Sciences → Computer Science → Software

Related Documents

Software Quality Prediction based on Defect Severity

Tackling the Imbalanced Data in Software Maintainability Prediction Using Ensembles for Class Imbalance Problem

CIL-BSP: Bug Report Severity Prediction based on Class Imbalanced Learning

An intuitionistic fuzzy representation based software bug severity prediction approach for imbalanced severity classes

Improving Heart Disease Severity Prediction Using SMOTE for Imbalanced Data