Cardiovascular diseases remain a leading cause of mortality worldwide, which makes accurate prediction techniques important for timely identification and intervention. This research evaluates three machine learning models for predicting heart disease, namely Logistic Regression, the K-Nearest Neighbor (KNN) Classifier, and Gaussian Naive Bayes (Gaussian NB), using datasets of varying sizes and test proportions. The models were trained and evaluated on demographic, clinical, and physiological variables such as age, gender, blood pressure, cholesterol level, fasting blood sugar level, electrocardiogram results, and maximum heart rate. Across all dataset sizes and test proportions, Gaussian NB consistently performed best, with high accuracy, precision, recall, and F1-score. Logistic Regression also performed well, scoring slightly below Gaussian NB while offering simplicity and interpretability. In contrast, the performance of the KNN Classifier varied with dataset size and test proportion, indicating sensitivity to these factors. These findings underscore the potential of machine learning models, particularly Gaussian NB and Logistic Regression, to improve heart disease prediction accuracy, thereby enabling targeted interventions and personalized treatment strategies for better patient outcomes. Further optimization and closer attention to dataset characteristics are needed to improve the performance of the KNN Classifier on heart disease prediction tasks.
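The four evaluation metrics named above can all be derived from the counts of true and false positives and negatives. The following is a minimal sketch of that calculation, not the authors' implementation: the function name `binary_metrics`, the convention that label 1 means "disease present", and the example labels are assumptions made for illustration. Libraries such as scikit-learn provide equivalent functions (`accuracy_score`, `precision_score`, `recall_score`, `f1_score`).

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Return accuracy, precision, recall and F1 for binary labels.

    Assumption: label 1 means "disease present", label 0 means "absent".
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one sample is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    # Fall back to 0.0 when a denominator is zero (no predicted or no actual positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0

    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical labels, made up for illustration only (not data from the study).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
scores = binary_metrics(y_true, y_pred)
print(scores)
```

Applying the same function to each model's predictions at each test proportion would give the kind of side-by-side comparison the abstract describes. In a medical screening setting, recall is often examined closely because a false negative corresponds to a missed case.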
Saptarsi Sanyal, Dolly Das, Saroj Kr. Biswas, Manomita Chakraborty, Biswajit Purkayastha
Raja Aswathi R, Pazhani Kumar K, B. Ramakrishnan
Sourabh Kumar, Saroj Kumar Chandra
Saroj Kumar Chandra, Ram Narayan Shukla, Ashok Bhansali