JOURNAL ARTICLE

Software Defect Prediction Based on Cost-Sensitive Dictionary Learning

Hongyan Wan, Guoqing Wu, Mali Yu, Mengting Yuan

Year: 2019   Journal: International Journal of Software Engineering and Knowledge Engineering   Vol: 29 (09)   Pages: 1219-1243   Publisher: World Scientific

Abstract

Software defect prediction is widely used to improve the quality of software systems. Most real software defect datasets tend to have fewer defective modules than defect-free modules, and such highly class-imbalanced data make accurate prediction difficult: a prediction model trained on them easily misclassifies a defective module as a defect-free one. Because different software modules are similar to one another, a module can be represented by sparse representation coefficients over a pre-defined dictionary built from historical software defect datasets. In this study, we use dictionary learning to predict software defects. We optimize the classifier parameters and the dictionary atoms iteratively, to ensure that the extracted features (sparse representations) are optimal for the trained classifier. We prove the optimality condition of the elastic net, which is used to solve for the sparse coding coefficients, and the regularity of the elastic net solution. Because misclassifying a defective module generally incurs a much higher cost than misclassifying a defect-free one, we take the different misclassification costs into account: increasing the penalty for misclassifying defective modules during dictionary learning biases the classifier toward labeling a module as defective. We thus propose a cost-sensitive software defect prediction method using dictionary learning (CSDL). Experimental results on 10 class-imbalanced NASA datasets show that our method is more effective than several typical state-of-the-art defect prediction methods.
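
The idea of coding a module over a dictionary of historical modules can be illustrated with a minimal sketch. It is not the CSDL algorithm from the paper: the dictionary is fixed rather than learned, and the misclassification cost is applied only at decision time instead of inside dictionary learning. The function names, the toy two-metric modules, the parameters `lam1`, `lam2`, and `cost_ratio`, and the residual-based decision rule are all assumptions made for illustration.

```python
def soft_threshold(z, t):
    """Shrink z toward zero by t (the proximal step for the L1 penalty)."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def elastic_net_code(x, atoms, lam1=0.01, lam2=0.1, n_iter=500):
    """Coordinate descent for
    min_a 0.5*||x - sum_j a_j*atoms[j]||^2 + lam1*||a||_1 + 0.5*lam2*||a||^2.
    """
    coeffs = [0.0] * len(atoms)
    residual = list(x)  # equals x - D a, with a = 0 at the start
    norms = [sum(v * v for v in atom) for atom in atoms]
    for _ in range(n_iter):
        for j, atom in enumerate(atoms):
            # Correlation of atom j with the residual, excluding its own term.
            rho = sum(d * r for d, r in zip(atom, residual)) + norms[j] * coeffs[j]
            new = soft_threshold(rho, lam1) / (norms[j] + lam2)
            delta = new - coeffs[j]
            if delta != 0.0:
                residual = [r - delta * d for r, d in zip(residual, atom)]
                coeffs[j] = new
    return coeffs


def class_residual(x, atoms, coeffs, labels, cls):
    """Reconstruction error of x using only the atoms labeled cls."""
    recon = [0.0] * len(x)
    for atom, c, lab in zip(atoms, coeffs, labels):
        if lab == cls:
            recon = [r + c * d for r, d in zip(recon, atom)]
    return sum((xi - ri) ** 2 for xi, ri in zip(x, recon)) ** 0.5


def predict(x, atoms, labels, cost_ratio=1.0, lam1=0.01, lam2=0.1):
    """Return 1 (defective) or 0 (defect-free).

    cost_ratio > 1 is a hypothetical knob that tolerates a worse fit from the
    defective atoms, biasing the decision toward the defective class.
    """
    coeffs = elastic_net_code(x, atoms, lam1, lam2)
    r_defective = class_residual(x, atoms, coeffs, labels, 1)
    r_clean = class_residual(x, atoms, coeffs, labels, 0)
    return 1 if r_defective <= cost_ratio * r_clean else 0


# Toy dictionary (hypothetical data): two defective and two defect-free modules,
# each described by two made-up metric values.
ATOMS = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
LABELS = [1, 1, 0, 0]

borderline = [0.45, 0.55]  # slightly closer to the defect-free atoms
neutral = predict(borderline, ATOMS, LABELS, cost_ratio=1.0)
cost_sensitive = predict(borderline, ATOMS, LABELS, cost_ratio=2.0)
```

On this toy data, raising `cost_ratio` should move the borderline module from the defect-free class to the defective class, which mirrors the abstract's point that a higher penalty on missed defects tilts the classifier toward predicting a defect.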

Keywords:
Computer science, Classifier (UML), Software bug, Software, Artificial intelligence, Machine learning, Data mining, Sparse approximation, Pattern recognition (psychology), Programming language

Metrics

Cited By: 7
FWCI (Field Weighted Citation Impact): 2.55
Refs: 56
Citation Normalized Percentile: 0.91

Topics

Software Engineering Research
Physical Sciences → Computer Science → Information Systems
Imbalanced Data Classification Techniques
Physical Sciences → Computer Science → Artificial Intelligence
Software Reliability and Analysis Research
Physical Sciences → Computer Science → Software

Related Documents

JOURNAL ARTICLE

Cost-sensitive Dictionary Learning for Software Defect Prediction

Liang Niu, Jianwu Wan, Hongyuan Wang, Kaiwei Zhou

Journal: Neural Processing Letters   Year: 2020   Vol: 52 (3)   Pages: 2415-2449

JOURNAL ARTICLE

Two-Stage Cost-Sensitive Learning for Software Defect Prediction

Mingxia Liu, Linsong Miao, Daoqiang Zhang

Journal: IEEE Transactions on Reliability   Year: 2014   Vol: 63 (2)   Pages: 676-686

JOURNAL ARTICLE

Software Defect Prediction Using Dictionary Learning

Hongyan Wan, Guoqing Wu, Ming Cheng, Qing Huang, Rui Wang, Mengting Yuan

Journal: Proceedings/Proceedings of the ... International Conference on Software Engineering and Knowledge Engineering   Year: 2017   Vol: 2017   Pages: 335-340