Cardiovascular diseases (CVDs) are a group of disorders affecting the heart or involving narrowed blood vessels. Early detection increases the likelihood of survival. Machine learning methods have therefore emerged that can process and analyse vast quantities of complex medical data and predict diseases, including CVDs, more accurately. However, because of factors such as overfitting and bias, a single classifier cannot ensure optimal prediction. This study therefore proposes a stacking ensemble classifier, which combines several single classifiers to produce an optimal predictive model. The Framingham Heart Study dataset was used to train the machine learning algorithms. Exploratory data analysis indicates that CVD was more common in males and in diabetic individuals, and that individuals above the age of 65 were more susceptible to CVDs. Data preprocessing comprised feature selection, missing value imputation, and data sampling. The proposed stacked ensemble classifier achieved 88.33% accuracy, 89.95% precision, 86.27% recall, and an 88.07% F1-score. Significance testing indicates that the proposed model performs significantly better than most of the models evaluated in this research. Finally, a comparative analysis showed that the proposed ensemble classifier outperforms most prior studies using the same dataset. The high F1-score indicates that the proposed model can accurately predict cases both with and without CVD.
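As a quick consistency check on the reported metrics, the F1-score can be recomputed from the stated precision and recall, since F1 is their harmonic mean. The sketch below is illustrative only: the function name `f1_from_precision_recall` is an assumption introduced here, not something from the study, and the inputs are simply the percentages quoted above.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall.

    Hypothetical helper for illustration; inputs and output share the
    same units (here, percent).
    """
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both metrics are zero
    return 2 * precision * recall / (precision + recall)


# Figures as reported in the abstract, in percent.
reported_precision = 89.95
reported_recall = 86.27
reported_f1 = 88.07

computed_f1 = f1_from_precision_recall(reported_precision, reported_recall)
print(f"computed F1 = {computed_f1:.2f}% (reported: {reported_f1:.2f}%)")
```

Rounded to two decimals, the recomputed value agrees with the reported 88.07%, so the three figures are mutually consistent.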