JOURNAL ARTICLE

Enhancing Software Effort Estimation in Healthcare Informatics: A Comparative Analysis of Machine Learning Models with Correlation-Based Feature Selection

Muhammad Abid, Sama Bukhari, Muhammad Saqlain

Year: 2025 | Journal: Sustainable Machine Intelligence Journal | Vol: 10 | Pages: 50-66

Abstract

Software effort estimation is one of the most important processes in software project management, particularly in the healthcare industry. It involves predicting the effort needed to develop and endorse software applications. Delivering clinical projects on time and within budget requires accurate estimates and efficient planning. This paper presents techniques that use machine learning models to improve software effort estimation on biomedical datasets: Breast Cancer Wisconsin, COVID-19, Sleepy Drivers EEG Brainwave, Heart Disease Prediction, and Food Nutrition. Each dataset is cleaned and prepared by handling missing values, converting categorical features, and splitting the data into training and testing sets, and is then used to train four popular machine learning models: Linear Regression, Gradient Boosting, Random Forest, and Decision Tree. In addition, correlation-based feature selection is applied to the feature matrix to investigate the influence of statistically linked features and to promote reliability. Two performance metrics, R² and Root Mean Squared Error (RMSE), are used to evaluate the models. The results indicate that Linear Regression and Gradient Boosting give substantially better results than the other models when features are chosen on the basis of correlation. R² scores are particularly high for the Food Nutrition, Breast Cancer, and COVID-19 datasets, while RMSE is lowest for the COVID-19 dataset, indicating high accuracy.
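The workflow summarized in the abstract (correlation-based feature selection, four regressors, R² and RMSE) can be illustrated with a minimal sketch. This is not the authors' implementation: the synthetic data from scikit-learn's make_regression stands in for the biomedical datasets, and the helper names select_by_correlation and evaluate_models, the 0.1 correlation threshold, the 80/20 split, and the random seeds are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor


def select_by_correlation(X, y, threshold=0.1):
    """Hypothetical helper: keep columns whose |Pearson r| with y is >= threshold."""
    corr = X.corrwith(y).abs()
    return corr[corr >= threshold].index.tolist()


def evaluate_models(X_train, X_test, y_train, y_test, seed=0):
    """Hypothetical helper: fit the four model families and report R^2 and RMSE."""
    models = {
        "Linear Regression": LinearRegression(),
        "Gradient Boosting": GradientBoostingRegressor(random_state=seed),
        "Random Forest": RandomForestRegressor(random_state=seed),
        "Decision Tree": DecisionTreeRegressor(random_state=seed),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        pred = model.predict(X_test)
        scores[name] = {
            "r2": r2_score(y_test, pred),
            "rmse": float(np.sqrt(mean_squared_error(y_test, pred))),
        }
    return scores


# Synthetic stand-in data (assumption): 10 features, 4 of them informative.
X_arr, y_arr = make_regression(
    n_samples=300, n_features=10, n_informative=4, noise=5.0, random_state=0
)
X = pd.DataFrame(X_arr, columns=[f"f{i}" for i in range(10)])
y = pd.Series(y_arr, name="target")

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Select features on the training split only, so the test split stays unseen.
selected = select_by_correlation(X_train, y_train)
results = evaluate_models(X_train[selected], X_test[selected], y_train, y_test)

for name, s in results.items():
    print(f"{name}: R2={s['r2']:.3f}, RMSE={s['rmse']:.3f}")
```

Computing the correlations on the training split alone is a design choice in this sketch; the abstract does not say at which stage the authors performed the selection.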

Keywords:
Feature selection; Computer science; Estimation; Feature (linguistics); Informatics; Selection (genetic algorithm); Health informatics; Machine learning; Software; Artificial intelligence; Correlation; Health care; Data mining; Data science; Engineering; Mathematics; Systems engineering; Political science

Metrics

Cited By: 1
FWCI (Field Weighted Citation Impact): 9.66
Refs: 39
Citation Normalized Percentile: 0.90

Topics

Software Engineering Research
Physical Sciences →  Computer Science →  Information Systems
Software System Performance and Reliability
Physical Sciences →  Computer Science →  Computer Networks and Communications