Imran Amin, Setyawan Wibisono, Endang Lestariningsih, Muhammad Lutfi
This study evaluates ensemble learning techniques for heart disease prediction, with a focus on Random Forest because of its robustness on complex medical data. The dataset, the "Heart Disease Prediction Dataset" from Kaggle, consists of 270 instances and 13 features such as age, cholesterol, and family history. Preprocessing involved mean imputation of missing values and min-max normalization. The study compares Random Forest with three other ensemble classifiers (AdaBoost, Gradient Boosting, and XGBoost) using 10-fold cross-validation and four evaluation metrics: accuracy, precision, recall, and F1 score. Random Forest outperforms the other models, with an accuracy of 87.04%, precision of 85.00%, recall of 80.95%, and F1 score of 82.93%. These findings indicate that Random Forest keeps its predictions stable across varied medical attributes and imbalanced data. Although the study highlights Random Forest as a promising method for early heart disease risk prediction, it remains a computational evaluation and requires clinical validation. The results are intended to inform the development of predictive tools that support early diagnosis and preventive strategies in healthcare systems.
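The four reported metrics are related by simple arithmetic, and the F1 score can be checked directly from the reported precision and recall. The sketch below is a minimal illustration and is not the authors' code. The function names are hypothetical, and the confusion-matrix counts in the second example are illustrative values chosen here only because they reproduce the reported percentages; they are not taken from the study.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """F1 is the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def metrics_from_counts(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1_from_precision_recall(precision, recall),
    }


# Check the reported F1 against the reported precision (85.00%) and recall (80.95%).
reported_f1 = f1_from_precision_recall(0.8500, 0.8095)

# ASSUMPTION: hypothetical counts, not reported in the study, used only
# to show how the four metrics derive from one confusion matrix.
example = metrics_from_counts(tp=17, fp=3, fn=4, tn=30)

if __name__ == "__main__":
    print(f"F1 from reported precision/recall: {reported_f1 * 100:.2f}%")
    for name, value in example.items():
        print(f"{name}: {value * 100:.2f}%")
```

The reported F1 of 82.93% agrees with the harmonic mean of the reported precision and recall, so the three figures are internally consistent.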
Vivek Tomar, Jyoti Rawat, Ajey Kumar Pathak, Sanjeev Thakur
Tanishq Soni, Deepali Gupta, Mudita Uppal
Nivyn Bybin, Parvathy Gopan, K Ramkrishna, Ryan Sebastain Jimmy, Anu Eldho, Rotney Roy Meckmalil