JOURNAL ARTICLE

Breast Cancer Detection by Data visualization and Feature selection using XG Boost Algorithm

Abstract

Breast cancer is the most commonly diagnosed cancer in women worldwide. It starts in the glandular tissue of the breast, called lobules, or in other parts of the breast tissue. Detecting the cancer as early as possible is very important. Tumours are of two types: cancerous (malignant) and non-cancerous (benign). In this paper, the Wisconsin Breast Cancer data set, a tabular data set, is used. The prime goal is to visualize the available data and then select the best features. After selecting the best features, we apply several machine learning algorithms: KNN, Random Forest, Decision Tree, Naive Bayes, SVM, Logistic Regression, and XGBoost. Such classifiers can help build a system that detects breast cancer in women at an early stage. The XGBoost algorithm outperforms the other algorithms on our selected features, giving an accuracy of 98.75%.
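
The workflow the abstract describes (select features, then compare classifiers) can be sketched roughly as follows. This is a minimal illustration and not the authors' code. It assumes the copy of the Wisconsin diagnostic data set bundled with scikit-learn, an ANOVA F-score filter keeping a hypothetical k = 10 features, and an 80/20 stratified split; none of these choices is stated in the abstract. scikit-learn's `GradientBoostingClassifier` is used as a stand-in for XGBoost so the sketch needs only one library. The accuracies it prints will not necessarily match the 98.75% reported above.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Assumption: scikit-learn's bundled Wisconsin diagnostic data set
# (569 samples, 30 numeric features, binary malignant/benign label).
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# The seven classifier families named in the abstract.
# GradientBoostingClassifier is a stand-in for XGBoost (assumption).
models = {
    "KNN": KNeighborsClassifier(),
    "Random Forest": RandomForestClassifier(random_state=0),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Naive Bayes": GaussianNB(),
    "SVM": SVC(),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Gradient boosting (XGBoost stand-in)": GradientBoostingClassifier(random_state=0),
}

results = {}
for name, model in models.items():
    # Scale, keep the 10 features with the highest ANOVA F-score
    # (k=10 is a hypothetical choice), then fit the classifier.
    pipeline = make_pipeline(StandardScaler(), SelectKBest(f_classif, k=10), model)
    pipeline.fit(X_train, y_train)
    results[name] = accuracy_score(y_test, pipeline.predict(X_test))

# Report test accuracy, best model first.
for name, acc in sorted(results.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {acc:.4f}")
```

Placing the selector inside the pipeline means the features are chosen from the training split only, which avoids leaking test information into the selection step. If the `xgboost` package is installed, its `XGBClassifier` could be substituted for the stand-in entry.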

Keywords:
Random forest; Feature selection; Naive Bayes classifier; Breast cancer; Computer science; Decision tree; Support vector machine; Artificial intelligence; Cancer; Algorithm; Machine learning; Feature (linguistics); Data set; Statistical classification; Logistic regression; Tree (set theory); Set (abstract data type); Pattern recognition (psychology); Mathematics; Medicine; Internal medicine

Metrics

Cited By: 5
FWCI (Field Weighted Citation Impact): 0.98
Refs: 11
Citation Normalized Percentile: 0.75

Topics

AI in cancer detection
Physical Sciences →  Computer Science →  Artificial Intelligence
Artificial Intelligence in Healthcare
Health Sciences →  Health Professions →  Health Information Management
Gene expression and cancer classification
Life Sciences →  Biochemistry, Genetics and Molecular Biology →  Molecular Biology