JOURNAL ARTICLE

Class-Level Refactoring Prediction by Ensemble Learning with Various Feature Selection Techniques

Rasmita Panigrahi, Sanjay Kumar Kuanar, Sanjay Misra, Lov Kumar

Year: 2022   Journal: Applied Sciences   Vol: 12 (23)   Pages: 12217-12217   Publisher: Multidisciplinary Digital Publishing Institute

Abstract

Background: Refactoring is the process of changing a software system without affecting its functionality. Current research aims to identify the appropriate method(s) or class(es) that need to be refactored in object-oriented software. Ensemble learning helps to reduce prediction errors by combining different classifiers and their respective performances over the original feature data. This paper additionally considers several ensemble learners, error measures, sampling techniques, and feature selection techniques for refactoring prediction at the class level.

Objective: This work aims to develop an ensemble-based refactoring prediction model that identifies relevant source code metrics using different feature selection techniques and uses data sampling techniques to distribute the data uniformly. The model selects the best classifier as the one that produces the fewest errors during refactoring prediction at the class level.

Methodology: First, the proposed model extracts a total of 125 software metrics computed from object-oriented software systems. These metrics are processed by a robust multi-phased feature selection method encompassing the Wilcoxon significance test, the Pearson correlation test, and principal component analysis (PCA). The proposed multi-phased feature selection method retains the optimal features characterizing inheritance, size, coupling, cohesion, and complexity. After obtaining the optimal set of software metrics, a novel heterogeneous ensemble classifier is developed. Its base classifiers are ANN-Gradient Descent, ANN-Levenberg Marquardt, ANN-GDX, and ANN-Radial Basis Function; support vector machines with different kernel functions (LSSVM-Linear, LSSVM-Polynomial, and LSSVM-RBF); a Decision Tree algorithm; a Logistic Regression algorithm; and an extreme learning machine (ELM) model. Four error measures are calculated: Mean Absolute Error (MAE), Mean Magnitude of Relative Error (MORE), Root Mean Square Error (RMSE), and Standard Error of Mean (SEM).

Result: In the proposed model, the maximum voting ensemble (MVE) achieves better accuracy, recall, precision, and F-measure values (99.76, 99.93, 98.96, and 98.44, respectively) than the base trained ensemble (BTE), and it produces fewer errors (MAE = 0.0057, MORE = 0.0701, RMSE = 0.0068, and SEM = 0.0107) when used to develop the refactoring model.

Conclusions: The experimental results suggest that MVE with upsampling can be used to improve the performance of the refactoring prediction model at the class level. Furthermore, the performance of the model with different data sampling techniques and feature selection techniques is shown as boxplot diagrams of accuracy, F-measure, precision, recall, and area under the curve (AUC).
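As a rough illustration of two ideas named in the abstract, the sketch below combines the labels of several base classifiers by maximum (majority) voting and then computes the four error measures. It is not the authors' implementation: the function names `majority_vote` and `error_measures`, the toy labels, the `offset` added to the denominator of the relative error (so that a true label of 0 does not divide by zero), and the choice to compute SEM over the absolute errors are all assumptions made for this example.

```python
import math
from collections import Counter
from statistics import mean, stdev


def majority_vote(predictions_per_classifier):
    """Combine per-classifier label lists into one list by majority vote.

    Each inner list holds one base classifier's labels for the same samples.
    With an odd number of binary classifiers no ties can occur.
    """
    combined = []
    for votes in zip(*predictions_per_classifier):
        # most_common(1) returns the label with the highest vote count
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


def error_measures(y_true, y_pred, offset=0.05):
    """Return MAE, MORE, RMSE and SEM for two equal-length label lists.

    `offset` is an assumed smoothing constant, not a value from the paper.
    """
    abs_err = [abs(float(t) - float(p)) for t, p in zip(y_true, y_pred)]
    n = len(abs_err)
    mae = mean(abs_err)
    # Relative error per sample, with the assumed offset in the denominator
    more = mean([e / (abs(t) + offset) for e, t in zip(abs_err, y_true)])
    rmse = math.sqrt(mean([e * e for e in abs_err]))
    # Assumption: SEM is the sample standard deviation of the absolute
    # errors divided by sqrt(n); requires at least two samples
    sem = stdev(abs_err) / math.sqrt(n)
    return {"MAE": mae, "MORE": more, "RMSE": rmse, "SEM": sem}


# Toy data: three hypothetical base classifiers, four classes under test
# (1 = needs refactoring, 0 = does not)
base_predictions = [
    [1, 0, 1, 1],
    [1, 1, 0, 1],
    [0, 0, 1, 1],
]
ensemble_labels = majority_vote(base_predictions)
print(ensemble_labels)
print(error_measures([1, 0, 0, 1], ensemble_labels))
```

An odd number of base classifiers is used in the toy data so that every vote has a clear winner; with an even number, a tie-breaking rule would need to be chosen explicitly.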

Keywords:
Computer science, Code refactoring, Artificial intelligence, Feature selection, Machine learning, Support vector machine, Data mining, Ensemble learning, Classifier (UML), Software, Pattern recognition (psychology)

Metrics

Cited By: 9
FWCI (Field Weighted Citation Impact): 3.04
Refs: 68
Citation Normalized Percentile: 0.91

Topics

Software Engineering Research
Physical Sciences → Computer Science → Information Systems
Software Reliability and Analysis Research
Physical Sciences → Computer Science → Software
Advanced Malware Detection Techniques
Physical Sciences → Computer Science → Signal Processing

Related Documents

JOURNAL ARTICLE

Heart Disease Prediction using Feature Selection and Ensemble Learning Techniques

A. Lakshmanarao, A. Srisaila, T. Srinivasa Ravi Kiran

Journal: 2021 Third International Conference on Intelligent Communication Technologies and Virtual Mobile Networks (ICICV)   Year: 2021   Pages: 994-998

JOURNAL ARTICLE

Prediction of Parkinson's disease using feature selection and ensemble learning techniques

Tanya Sharan, Sujata Joshi

Journal: Indonesian Journal of Electrical Engineering and Computer Science   Year: 2025   Vol: 39 (3)   Pages: 1736-1736