JOURNAL ARTICLE

Optimized multi correlation-based feature selection in software defect prediction

Muhammad Nabil Muyassar Rahman, Radityo Adi Nugroho, Mohammad Reza Faisal, Friska Abadi, Rudy Herteno

Year: 2024   Journal: TELKOMNIKA (Telecommunication Computing Electronics and Control)   Vol: 22 (3)   Pages: 598-598   Publisher: Ahmad Dahlan University

Abstract

In software defect prediction, noisy attributes and high-dimensional data remain critical challenges. This paper introduces a novel approach, multi correlation-based feature selection (MCFS), to address them. MCFS integrates two feature selection techniques, correlation-based feature selection (CFS) and correlation matrix-based feature selection (CMFS), with the aim of reducing data dimensionality and eliminating noisy attributes. CFS and CMFS are applied independently to filter the datasets, and a weighted average of their outcomes is computed to determine the optimal feature selection. This approach not only reduces data dimensionality but also mitigates the impact of noisy attributes. To further enhance predictive performance, the paper uses the particle swarm optimization (PSO) algorithm as a feature selection mechanism, specifically targeting improvements in the area under the curve (AUC). The proposed method is evaluated on 12 benchmark datasets from the NASA metrics data program (MDP) corpus, which are known for their noisy attributes, high dimensionality, and imbalanced class records. The findings show that MCFS outperforms CFS and CMFS, yielding an average AUC of 0.891, which underscores its efficacy in improving classification performance for software defect prediction with k-nearest neighbors (KNN) classification.
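The combination step described in the abstract can be illustrated with a minimal sketch. The abstract does not say what CFS and CMFS output, how the outcomes are normalized, or which weights are used, so everything below is an assumption made for illustration: each selector is taken to produce one relevance score per feature, the scores are min-max scaled, and the names `min_max_scale`, `combine_scores`, `select_top_k`, and the parameter `cfs_weight` are hypothetical. The PSO search and the KNN evaluation are not shown.

```python
from typing import List, Sequence


def min_max_scale(scores: Sequence[float]) -> List[float]:
    """Rescale scores to [0, 1] so the two selectors are comparable (assumed step)."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All scores equal: no feature is preferred over another.
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def combine_scores(
    cfs_scores: Sequence[float],
    cmfs_scores: Sequence[float],
    cfs_weight: float = 0.5,
) -> List[float]:
    """Weighted average of two per-feature score lists (hypothetical scheme)."""
    if len(cfs_scores) != len(cmfs_scores):
        raise ValueError("both selectors must score the same features")
    if not 0.0 <= cfs_weight <= 1.0:
        raise ValueError("cfs_weight must lie in [0, 1]")
    a = min_max_scale(cfs_scores)
    b = min_max_scale(cmfs_scores)
    return [cfs_weight * x + (1.0 - cfs_weight) * y for x, y in zip(a, b)]


def select_top_k(combined: Sequence[float], k: int) -> List[int]:
    """Return the indices of the k highest-scoring features, best first."""
    order = sorted(range(len(combined)), key=lambda i: combined[i], reverse=True)
    return order[:k]


# Made-up scores for four features; not taken from the paper.
cfs = [0.9, 0.1, 0.5, 0.3]
cmfs = [0.2, 0.8, 0.6, 0.1]
selected = select_top_k(combine_scores(cfs, cmfs, cfs_weight=0.5), k=2)
print(selected)
```

In this sketch a feature that is only moderately ranked by each selector can still be chosen when both agree on it, which is one plausible reason for averaging two filters rather than relying on either alone.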

Keywords:
Feature selection; Selection (genetic algorithm); Computer science; Correlation; Feature (linguistics); Software; Software bug; Artificial intelligence; Pattern recognition (psychology); Data mining; Machine learning; Mathematics; Programming language

Metrics

Cited By: 6
FWCI (Field Weighted Citation Impact): 9.17
Refs: 37
Citation Normalized Percentile: 0.96

Topics

Software Engineering Research
Physical Sciences →  Computer Science →  Information Systems
Software Reliability and Analysis Research
Physical Sciences →  Computer Science →  Software
Software System Performance and Reliability
Physical Sciences →  Computer Science →  Computer Networks and Communications