JOURNAL ARTICLE

Heterogeneous Defect Prediction Through Multiple Kernel Learning and Ensemble Learning

Abstract

Heterogeneous defect prediction (HDP) aims to predict defect-prone software modules in one project using heterogeneous data collected from other projects. Several HDP methods have been proposed recently. However, these methods do not sufficiently account for two characteristics of defect prediction data: (1) the data may be linearly inseparable, and (2) the data may be highly imbalanced. These two characteristics make it challenging to build an effective HDP model. In this paper, we propose a novel Ensemble Multiple Kernel Correlation Alignment (EMKCA) approach to HDP that takes both characteristics into account. Specifically, we first map the source and target project data into a high-dimensional kernel space through multiple kernel learning, where defective and non-defective modules can be better separated. We then design a kernel correlation alignment method to make the data distributions of the source and target projects similar in the kernel space. Finally, we integrate multiple kernel classifiers with ensemble learning to mitigate the influence of the class imbalance problem, which can improve the accuracy of the defect prediction model. Consequently, EMKCA combines the advantages of multiple kernel learning and ensemble learning. Extensive experiments on 30 public projects show that EMKCA outperforms related competing methods.
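To make the alignment step more concrete, the sketch below shows classic linear correlation alignment (CORAL), which whitens the source features and re-colors them with the target covariance. It is only an illustration of the general idea: the abstract does not give the kernel-space formulation used by EMKCA, and this version assumes that source and target already share the same number of features, which is not the case in a truly heterogeneous setting. The function names `coral_align` and `_sym_power`, the `reg` parameter, and the extra mean shift are assumptions made for this example.

```python
import numpy as np


def _sym_power(mat, power):
    """Raise a symmetric positive-definite matrix to a real power
    using its eigendecomposition (hypothetical helper)."""
    vals, vecs = np.linalg.eigh(mat)
    return (vecs * vals ** power) @ vecs.T


def coral_align(source, target, reg=1e-6):
    """Linear CORAL sketch: match the second-order statistics of
    `source` to those of `target`.

    Assumes both arrays have shape (n_samples, n_features) with the
    same n_features. `reg` adds a small ridge so the covariance
    matrices stay invertible.
    """
    d = source.shape[1]
    cov_s = np.cov(source, rowvar=False) + reg * np.eye(d)
    cov_t = np.cov(target, rowvar=False) + reg * np.eye(d)

    # Whiten the centered source data, then re-color with the target covariance.
    centered = source - source.mean(axis=0)
    aligned = centered @ _sym_power(cov_s, -0.5) @ _sym_power(cov_t, 0.5)

    # Assumption for this sketch: also shift to the target mean.
    return aligned + target.mean(axis=0)
```

After alignment, a classifier trained on the transformed source data sees features whose covariance closely matches that of the target project. In a kernel-based approach such as the one described above, the analogous step would operate on kernel-space representations rather than on raw metrics.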

Keywords:
Kernel (algebra), Computer science, Machine learning, Ensemble learning, Artificial intelligence, Tree kernel, Multiple kernel learning, Radial basis function kernel, Kernel embedding of distributions, Kernel method, Correlation, Software, Data mining, Support vector machine, Mathematics

Metrics

Cited By: 48
FWCI (Field Weighted Citation Impact): 11.20
Refs: 77
Citation Normalized Percentile: 0.98
Is in top 1%
Is in top 10%

Topics

Software Engineering Research
Physical Sciences →  Computer Science →  Information Systems
Software Reliability and Analysis Research
Physical Sciences →  Computer Science →  Software
Software System Performance and Reliability
Physical Sciences →  Computer Science →  Computer Networks and Communications

Related Documents

JOURNAL ARTICLE

Multiple kernel ensemble learning for software defect prediction

Tiejian Wang, Zhiwu Zhang, Xiao‐Yuan Jing, Liqiang Zhang

Journal: Automated Software Engineering, Year: 2015, Vol: 23 (4), Pages: 569-590
BOOK-CHAPTER

Heterogeneous Defect Prediction Using Ensemble Learning Technique

Arsalan Ahmed Ansari, Amaan Iqbal, Bibhudatta Sahoo

Advances in intelligent systems and computing, Year: 2020, Pages: 283-293
JOURNAL ARTICLE

Heterogeneous Defect Prediction through Correlation-Based Selection of Multiple Source Projects and Ensemble Learning

Eunseob Kim, Jongmoon Baik, Duksan Ryu

Journal: 2021 IEEE 21st International Conference on Software Quality, Reliability and Security (QRS), Year: 2021, Pages: 503-513
JOURNAL ARTICLE

Heterogeneous defect prediction with two-stage ensemble learning

Zhiqiang Li, Xiao‐Yuan Jing, Xiaoke Zhu, Hongyu Zhang, Baowen Xu, Shi Ying

Journal: Automated Software Engineering, Year: 2019, Vol: 26 (3), Pages: 599-651
JOURNAL ARTICLE

Kernel Spectral Embedding Transfer Ensemble for Heterogeneous Defect Prediction

Haonan Tong, Bin Liu, Shihai Wang

Journal: IEEE Transactions on Software Engineering, Year: 2019, Pages: 1-1