JOURNAL ARTICLE

Feature Selection Using Firefly Algorithm With Tree-Based Classification In Software Defect Prediction

Vina Maulida, Rudy Herteno, Dwi Kartini, Friska Abadi, Mohammad Reza Faisal

Year: 2023
Journal: Journal of Electronics, Electromedical Engineering, and Medical Informatics
Vol: 5 (4)

Abstract

Defects in software products are a universal occurrence, and software defect prediction is an area of software engineering that receives considerable attention from researchers. Studies in this area typically evaluate the accuracy, precision, and overall performance of a prediction model or method across a variety of datasets. This research examines the performance of three tree-based classification algorithms, Decision Tree, Random Forest, and Deep Forest, both without feature selection and with firefly feature selection, and identifies which tree-based classifier combined with firefly feature selection gives the best software defect prediction performance. The study uses the ReLink dataset, which consists of Apache, Safe, and ZXing. The data is split into training and testing data using 10-fold cross-validation, and feature selection is then performed with the Firefly Algorithm. Each ReLink dataset is processed by each tree-based classifier (Decision Tree, Random Forest, and Deep Forest) using the features chosen by the firefly feature selection. Performance is evaluated using the AUC (Area Under the ROC Curve). The experiments were run in Google Colab. The average AUC is 0.66 for Firefly-Decision Tree, 0.77 for Firefly-Random Forest, and 0.76 for Firefly-Deep Forest. These results indicate that the Firefly Algorithm combined with Random Forest classification predicts software defects better than the other tree-based algorithms tested. In previous studies, tree-based classification with hyperparameter tuning obtained quite good results on software defect prediction datasets. In another study, firefly feature selection improved the classification performance of SVM, Naïve Bayes, and K-nearest neighbor. This research was therefore conducted to determine the performance of tree-based algorithms with firefly feature selection.
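
The abstract does not say which firefly variant or parameter settings were used, so the following is only a minimal sketch of a generic binary firefly search over feature masks, not the authors' implementation. The names `firefly_select`, `to_mask`, and `toy_fitness`, and the defaults for `alpha`, `beta0`, and `gamma`, are assumptions made for illustration. In a setting like the one described, the fitness function would presumably be the cross-validated AUC of a tree-based classifier trained on the selected features; here a hypothetical toy fitness stands in so that the example needs only the standard library.

```python
import math
import random


def to_mask(position):
    """Turn a real-valued position into a boolean feature mask.

    Thresholding at 0 is equivalent to thresholding a sigmoid at 0.5.
    """
    return [x > 0.0 for x in position]


def firefly_select(fitness, n_features, n_fireflies=10, n_iter=20,
                   alpha=0.2, beta0=1.0, gamma=1.0, seed=0):
    """Search for a feature mask that maximises `fitness` (higher is better).

    alpha, beta0 and gamma are assumed defaults, not values from the study.
    """
    rng = random.Random(seed)
    swarm = [[rng.uniform(-1.0, 1.0) for _ in range(n_features)]
             for _ in range(n_fireflies)]
    light = [fitness(to_mask(p)) for p in swarm]  # brightness of each firefly

    best = max(range(n_fireflies), key=light.__getitem__)
    best_mask, best_score = to_mask(swarm[best]), light[best]

    for _ in range(n_iter):
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                if light[j] > light[i]:
                    # Attractiveness decays with squared distance.
                    r2 = sum((a - b) ** 2 for a, b in zip(swarm[i], swarm[j]))
                    beta = beta0 * math.exp(-gamma * r2)
                    # Move firefly i toward brighter firefly j, plus a random step.
                    swarm[i] = [a + beta * (b - a) + alpha * (rng.random() - 0.5)
                                for a, b in zip(swarm[i], swarm[j])]
                    light[i] = fitness(to_mask(swarm[i]))
                    if light[i] > best_score:
                        best_mask, best_score = to_mask(swarm[i]), light[i]
    return best_mask, best_score


def toy_fitness(mask):
    """Hypothetical stand-in for cross-validated AUC.

    Rewards selecting features 0 and 2, lightly penalises any other feature.
    """
    informative = {0, 2}
    hits = sum(1 for k, on in enumerate(mask) if on and k in informative)
    noise = sum(1 for k, on in enumerate(mask) if on and k not in informative)
    return hits - 0.1 * noise


mask, score = firefly_select(toy_fitness, n_features=5)
print(mask, score)
```

To approximate the evaluation described in the abstract, `toy_fitness` could be replaced by a function that trains a Decision Tree, Random Forest, or Deep Forest on the masked columns and returns the mean AUC over 10 folds. This sketch leaves the brightest firefly in place during each pass, which is one common simplification.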

Keywords:
Firefly algorithm, Random forest, Feature selection, Computer science, Artificial intelligence, Decision tree, Tree (set theory), Feature (linguistics), Software, Machine learning, Data mining, Selection (genetic algorithm), Firefly protocol, Statistical classification, Pattern recognition (psychology), Mathematics

Metrics

Cited By: 3
FWCI (Field Weighted Citation Impact): 1.86
Refs: 27
Citation Normalized Percentile: 0.87

Topics

Data Mining and Machine Learning Applications
Physical Sciences →  Computer Science →  Information Systems
Educational Technology Systems
Physical Sciences →  Computer Science →  Artificial Intelligence
Software Engineering Research
Physical Sciences →  Computer Science →  Information Systems