JOURNAL ARTICLE

Evaluation of Feature Selection Techniques for Breast Cancer Risk Prediction

Nahúm Cueto López, María Teresa García-Ordás, Facundo Vitelli-Storelli, Pablo Fernández-Navarro, Camilo Palazuelos, Rocío Aláiz-Rodríguez

Year: 2021   Journal: International Journal of Environmental Research and Public Health   Vol: 18 (20)   Pages: 10670-10670   Publisher: Multidisciplinary Digital Publishing Institute

Abstract

This study evaluates several feature ranking techniques together with machine learning classifiers to identify the factors most relevant to the probability of developing breast cancer and to improve the performance of breast cancer risk prediction models in a healthy population. The dataset, with 919 cases and 946 controls, comes from the MCC-Spain study and includes only environmental and genetic features. Breast cancer is a major public health problem. Our aim is to analyze which factors in the cancer risk prediction model are the most important for breast cancer prediction. Likewise, quantifying the stability of feature selection methods is essential before trying to gain insight into the data. This paper assesses several feature selection algorithms in terms of the performance of a set of predictive models. Furthermore, their robustness is quantified by analyzing both the similarity between the feature selection rankings and the stability of each ranking. The ranking provided by the SVM-RFE approach leads to the best performance in terms of the area under the ROC curve (AUC). The top 47 ranked features obtained with this approach, fed to the Logistic Regression classifier, achieve an AUC of 0.616, an improvement of 5.8% over the full feature set. Furthermore, the SVM-RFE ranking technique turned out to be highly stable (as did Random Forest), whereas Relief and the wrapper approaches are quite unstable. This study demonstrates that the stability and performance of the model should be studied together: Random Forest and SVM-RFE turned out to be the most stable algorithms, but in terms of model performance SVM-RFE outperforms Random Forest.
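As a rough illustration of what quantifying the stability of a ranking can look like, the sketch below scores a set of rankings by the mean pairwise Jaccard index of their top-k features. The Jaccard index is an assumption chosen for simplicity, since the abstract does not name the stability measure the authors use. The feature names and rankings are hypothetical and are not taken from the MCC-Spain data.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two feature subsets: |A & B| / |A | B|."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def top_k_stability(rankings, k):
    """Mean pairwise Jaccard index of the top-k features across rankings.

    Each ranking lists feature names from most to least relevant.
    A value of 1.0 means every run selected the same top-k subset.
    """
    tops = [ranking[:k] for ranking in rankings]
    pairs = list(combinations(tops, 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical rankings from three resampled runs of one ranking method.
stable_runs = [
    ["age", "bmi", "snp1", "alcohol", "snp2"],
    ["age", "snp1", "bmi", "snp2", "alcohol"],
    ["bmi", "age", "snp1", "alcohol", "snp2"],
]
# Hypothetical rankings from a method whose output shifts between runs.
unstable_runs = [
    ["age", "bmi", "snp1", "alcohol", "snp2"],
    ["snp2", "alcohol", "age", "bmi", "snp1"],
    ["snp1", "snp2", "alcohol", "age", "bmi"],
]

print(top_k_stability(stable_runs, 3))
print(top_k_stability(unstable_runs, 3))
```

The same function can compare rankings produced by different methods on the same data, which corresponds to the similarity between rankings mentioned in the abstract. It ignores the order within the top-k subset, so a rank-aware measure would be needed to capture that.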

Keywords:
Feature selection; Random forest; Support vector machine; Artificial intelligence; Computer science; Machine learning; Ranking (information retrieval); Breast cancer; Logistic regression; Robustness (evolution); Receiver operating characteristic; Classifier (UML); Metric (unit); Data mining; Cancer; Medicine; Engineering; Biology; Internal medicine

Metrics

Cited By: 23
FWCI (Field Weighted Citation Impact): 2.12
Refs: 66
Citation Normalized Percentile: 0.89

Topics

AI in cancer detection
Physical Sciences →  Computer Science →  Artificial Intelligence
Gene expression and cancer classification
Life Sciences →  Biochemistry, Genetics and Molecular Biology →  Molecular Biology
Nutritional Studies and Diet
Health Sciences →  Medicine →  Public Health, Environmental and Occupational Health

Related Documents

JOURNAL ARTICLE

Comparative Study of Feature Selection Techniques for Breast Cancer Prediction

Pooleriveetil Padikkal Anagha, T. Sajana

Journal: International Journal for Research in Applied Science and Engineering Technology   Year: 2025   Vol: 13 (11)   Pages: 2301-2305

JOURNAL ARTICLE

Feature Selection based Breast Cancer Prediction

Rakibul Hasan, Aamir Shafi

Journal: International Journal of Image Graphics and Signal Processing   Year: 2023   Vol: 15 (2)   Pages: 13-23

JOURNAL ARTICLE

Breast cancer diagnosis using feature selection techniques

Sabrine Tounsi, Imen Kallel, Mohamed Kallel

Journal: 2022 2nd International Conference on Innovative Research in Applied Science, Engineering and Technology (IRASET)   Year: 2022   Pages: 1-5

BOOK-CHAPTER

Breast Cancer Prediction: Importance of Feature Selection

Prateek Prateek

Advances in intelligent systems and computing   Year: 2019   Pages: 733-742

JOURNAL ARTICLE

Analysis and prediction of breast cancer through feature selection and classification techniques

E. Sivasankar, A. Sathish Kumar, J. Sanjivi, P. Balasubramanian

Journal: International Journal of Medical Engineering and Informatics   Year: 2021   Vol: 13 (5)   Pages: 359-359