JOURNAL ARTICLE

Heterogeneous Fault Prediction Using Feature Selection and Supervised Learning Algorithms

Rashmi Arora, Arvinder Kaur

Year: 2022   Journal: Vietnam Journal of Computer Science   Vol: 09 (03)   Pages: 261-284   Publisher: World Scientific

Abstract

Software Fault Prediction (SFP) is a prominent research area in software engineering. Fault prediction carried out within a single software project is known as within-project fault prediction. However, local data repositories often do not hold enough data to build a within-project prediction model. Cross-project fault prediction (CPFP), suggested in recent years, addresses this by constructing a prediction model on one project and using that model to predict faults in another. However, CPFP requires that the training and testing datasets use the same set of metrics, so traditional CPFP approaches are difficult to apply across projects with diverse metric sets. Heterogeneous Fault Prediction (HFP) is a special case of CPFP that allows faults to be predicted across projects with diverse metrics. The proposed framework builds an HFP model by applying feature selection to both the source and target datasets and then training supervised machine learning techniques on the result. The approach is applied to two open-source projects, Linux and MySQL, and prediction performance is evaluated with the Area Under the Curve (AUC) measure. The key results are as follows. First, the approach gives significantly better prediction performance for heterogeneous projects than for cross projects. Second, feature selection combined with feature mapping has a significant effect on HFP models. Non-parametric statistical analyses, namely the Friedman test and the Nemenyi post-hoc test, show that Logistic Regression performed significantly better than the other supervised learning algorithms in HFP models.
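
The evaluation steps named in the abstract (AUC per classifier, then a Friedman test across classifiers) can be illustrated with a minimal pure-Python sketch. This is not the authors' code: the function names `auc`, `average_ranks`, and `friedman_statistic` are hypothetical, the AUC uses the standard pairwise (Mann-Whitney) formulation, and the Friedman statistic is the textbook chi-square form without tie correction. The numbers in the usage lines are toy values, not results from the paper.

```python
from itertools import product


def auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) formulation.

    labels: 1 for a faulty module, 0 for a non-faulty one.
    scores: predicted fault-proneness; higher means more likely faulty.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one faulty and one non-faulty module")
    # Fraction of (faulty, clean) pairs ranked correctly; ties count half.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p, n in product(pos, neg))
    return wins / (len(pos) * len(neg))


def average_ranks(row):
    """Rank the values of one row, 1 = highest; tied values share the mean rank."""
    order = sorted(range(len(row)), key=lambda i: -row[i])
    ranks = [0.0] * len(row)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and row[order[j + 1]] == row[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def friedman_statistic(auc_table):
    """Friedman chi-square for N datasets (rows) by k classifiers (columns).

    Returns (chi_square, mean_rank_per_classifier). No tie correction is applied.
    """
    n, k = len(auc_table), len(auc_table[0])
    rank_rows = [average_ranks(row) for row in auc_table]
    mean_ranks = [sum(r[j] for r in rank_rows) / n for j in range(k)]
    chi2 = 12 * n / (k * (k + 1)) * (
        sum(r * r for r in mean_ranks) - k * (k + 1) ** 2 / 4
    )
    return chi2, mean_ranks


# Toy usage: four modules scored by one hypothetical classifier.
print(auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))  # 0.75

# Toy AUC table: 3 prediction tasks (rows) by 3 classifiers (columns).
toy_table = [[0.80, 0.70, 0.60],
             [0.90, 0.80, 0.70],
             [0.85, 0.75, 0.65]]
chi2, mean_ranks = friedman_statistic(toy_table)
print(chi2, mean_ranks)
```

A lower mean rank indicates a classifier that tends to reach a higher AUC across tasks. The chi-square value would then be compared against a critical value with k - 1 degrees of freedom, and a post-hoc test such as Nemenyi would be run only if the Friedman test rejects the null hypothesis.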

Keywords:
Computer science, Machine learning, Data mining, Feature selection, Software, Predictive modelling, Artificial intelligence, Software metric, Metric (unit), Fault (geology), Feature (linguistics), Feature engineering, Software system, Software construction, Deep learning, Engineering

Metrics

Cited By: 10
FWCI (Field Weighted Citation Impact): 3.80
Refs: 28
Citation Normalized Percentile: 0.92

Topics

Software Engineering Research
Physical Sciences →  Computer Science →  Information Systems
Software Reliability and Analysis Research
Physical Sciences →  Computer Science →  Software
Software System Performance and Reliability
Physical Sciences →  Computer Science →  Computer Networks and Communications

Related Documents

JOURNAL ARTICLE

Towards Effective Service Discovery using Feature Selection and Supervised Learning Algorithms

Heyam H. Al-Baity, I. Norah

Journal: International Journal of Advanced Computer Science and Applications   Year: 2019   Vol: 10 (5)

JOURNAL ARTICLE

Software Reliability Prediction Using Deep Learning and Feature Selection Algorithms

Shahbaa I. Khalee, Lumia Faiz Salih

Journal: International Research Journal of Innovations in Engineering and Technology   Year: 2024   Vol: 08 (02)   Pages: 08-18