JOURNAL ARTICLE

Performance Investigation of Feature Selection Based on Random Forest in Heart Disease Prediction Using KNN Model

Seong G. Kong

Year: 2024 Journal: Science and Technology of Engineering Chemistry and Environmental Protection Vol: 1 (10)

Abstract

Heart disease is one of the leading causes of death worldwide, claiming millions of lives each year. To address this public health challenge, early prediction of heart disease using machine learning has become an active area of research. This study examines how the number of input features affects the performance of a K-Nearest Neighbors (KNN) model in predicting heart disease. First, a random forest algorithm was used to rank the importance of a large set of features and identify those most influential in predicting heart disease. Then, starting with the most important features, the study incrementally increased the number of features supplied to the KNN model and compared the model’s accuracy and recall across the resulting feature subsets. The results show that the model’s predictive performance does not improve consistently as the number of features increases. When the first additional features are added, accuracy declines sharply; although it recovers slightly later, it does not return to the level observed with fewer features. Recall, by contrast, improves significantly when the first additional features are added, but then fluctuates and decreases noticeably once a certain number of features is reached. The study demonstrates that simply adding features does not guarantee better model performance; instead, it may introduce redundant information or noise that weakens the model’s effectiveness.
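The procedure described in the abstract (rank features with a random forest, then feed the top-ranked features to KNN in growing subsets while tracking accuracy and recall) can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the study's actual code: the synthetic dataset, the 13-feature count, the 70/30 split, the choice of 5 neighbors, the use of impurity-based importances, and the standardization step are all assumptions made here for the example, and the helper name `evaluate_top_k_features` is hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler


def evaluate_top_k_features(X, y, n_neighbors=5, seed=0):
    """Rank features with a random forest, then score KNN on the top-k features for each k."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed
    )

    # Rank features by impurity-based importance, fitted on the training split only.
    forest = RandomForestClassifier(n_estimators=200, random_state=seed)
    forest.fit(X_train, y_train)
    ranking = np.argsort(forest.feature_importances_)[::-1]  # most important first

    # KNN is distance-based, so standardize using training-split statistics
    # (an assumption of this sketch; the abstract does not mention scaling).
    scaler = StandardScaler().fit(X_train)
    X_train_s, X_test_s = scaler.transform(X_train), scaler.transform(X_test)

    results = []
    for k in range(1, X.shape[1] + 1):
        cols = ranking[:k]  # the k most important features
        knn = KNeighborsClassifier(n_neighbors=n_neighbors)
        knn.fit(X_train_s[:, cols], y_train)
        pred = knn.predict(X_test_s[:, cols])
        results.append((k, accuracy_score(y_test, pred), recall_score(y_test, pred)))
    return ranking, results


# Synthetic stand-in data (an assumption; this is NOT the study's dataset).
X, y = make_classification(
    n_samples=400, n_features=13, n_informative=5, n_redundant=3, random_state=0
)
ranking, results = evaluate_top_k_features(X, y)
for k, acc, rec in results:
    print(f"top {k:2d} features: accuracy={acc:.3f} recall={rec:.3f}")
```

Plotting accuracy and recall against k from `results` gives the kind of curve the abstract discusses; on real data, the shape of that curve will depend on the dataset, the split, and the KNN settings.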

Keywords:
Random forest, Feature selection, Recall, Computer science, Feature (linguistics), Machine learning, Artificial intelligence, Set (abstract data type), Heart disease, Rank (graph theory), Selection (genetic algorithm), Precision and recall, Data mining, Pattern recognition (psychology), Medicine, Mathematics, Psychology, Cognitive psychology, Internal medicine

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 0
Citation Normalized Percentile: 0.37

Topics

Artificial Intelligence in Healthcare
Health Sciences →  Health Professions →  Health Information Management
