JOURNAL ARTICLE

Entropy-Based Feature Selection for Data Clustering Using k-Means and k-Medoids Algorithms

Abstract

Clustering splits a large dataset into smaller subsets, each of which is called a cluster. Objects within a cluster share similar characteristics, and each cluster differs from all other clusters. The most common clustering algorithms are the k-Means and k-Medoids algorithms. Clustering a high-dimensional dataset can be difficult. To overcome this problem, the dimension of the dataset is reduced. In the present work, we reduce the dimension of a dataset by selecting a suitable subset of features using an entropy-based method. We calculate entropy using both Euclidean and Manhattan distances. We experiment with three widely used datasets from the Machine Learning Repository of the University of California, Irvine (UCI). From the experimental results, we conclude that our approach produces higher clustering accuracies than those of previous works.
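
The abstract does not state the entropy formula, so the following is only a minimal sketch of one widely used distance-based entropy measure for unsupervised feature selection. It is not necessarily the authors' method. The similarity function `exp(-alpha * d)`, the choice of `alpha` (set so that the mean pairwise distance maps to a similarity of 0.5), and all function names are assumptions made for illustration. Features are assumed to be numeric and already scaled to comparable ranges.

```python
import numpy as np

def pairwise_distances(X, metric="euclidean"):
    """Return the full n x n distance matrix for the rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    if metric == "euclidean":
        return np.sqrt((diff ** 2).sum(axis=2))
    if metric == "manhattan":
        return np.abs(diff).sum(axis=2)
    raise ValueError(f"unknown metric: {metric}")

def dataset_entropy(X, metric="euclidean"):
    """Distance-based entropy of a dataset (assumed formulation).

    Each pair of points gets a similarity S = exp(-alpha * d); the entropy
    sums -(S log S + (1 - S) log(1 - S)) over all distinct pairs.
    """
    X = np.asarray(X, dtype=float)
    D = pairwise_distances(X, metric)
    d = D[np.triu_indices(X.shape[0], k=1)]  # distinct pairs only
    mean_d = d.mean()
    if mean_d == 0:
        return 0.0  # all points coincide, so there is no uncertainty
    alpha = np.log(2.0) / mean_d  # assumption: mean distance -> S = 0.5
    S = np.clip(np.exp(-alpha * d), 1e-12, 1 - 1e-12)  # avoid log(0)
    return float(-(S * np.log(S) + (1 - S) * np.log(1 - S)).sum())

def leave_one_out_entropy(X, metric="euclidean"):
    """Entropy of the dataset after removing each feature in turn."""
    X = np.asarray(X, dtype=float)
    return [dataset_entropy(np.delete(X, j, axis=1), metric)
            for j in range(X.shape[1])]
```

Under one common convention, the feature whose removal yields the lowest entropy is treated as the least informative and is dropped first; whether the paper ranks features this way cannot be confirmed from the abstract. Passing `metric="manhattan"` instead of `metric="euclidean"` switches between the two distances the abstract mentions, and the reduced feature set would then be handed to k-Means or k-Medoids.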

Keywords:
Cluster analysis; Computer science; Entropy (arrow of time); k-medians clustering; CURE data clustering algorithm; Correlation clustering; Single-linkage clustering; k-medoids; Euclidean distance; Clustering high-dimensional data; Data mining; Pattern recognition (psychology); Medoid; Artificial intelligence; Fuzzy clustering; Canopy clustering algorithm; Determining the number of clusters in a data set; k-means clustering

Metrics

Cited By: 6
FWCI (Field Weighted Citation Impact): 0.44
Refs: 8
Citation Normalized Percentile: 0.70

Topics

Advanced Clustering Algorithms Research
Physical Sciences →  Computer Science →  Artificial Intelligence
Data Mining Algorithms and Applications
Physical Sciences →  Computer Science →  Information Systems
Data Management and Algorithms
Physical Sciences →  Computer Science →  Signal Processing

Related Documents

JOURNAL ARTICLE

IMPROVED PARALLEL BIG DATA CLUSTERING BASED ON K-MEDOIDS AND K-MEANS ALGORITHMS

Rasim Alguliyev, Ramiz M. Aliguliyev, Lyudmila Sukhostat

Journal: Problems of Information Technology Year: 2024 Vol: 15 (1) Pages: 18-25
JOURNAL ARTICLE

Clustering Lung Cancer Data by k-Means and k-Medoids Algorithms

T. Velmurugan, D. Aravindh

Journal: International Journal of Data Mining Techniques and Applications Year: 2014 Vol: 3 (2) Pages: 95-98
BOOK-CHAPTER

Comparison between K-Means and K-Medoids Clustering Algorithms

T. Soni Madhulatha

Communications in Computer and Information Science Year: 2011 Pages: 472-481
JOURNAL ARTICLE

Laboratory Clustering using K-Means, K-Medoids, and Model-Based Clustering

Niswatul Qona’ah, Alvita Rachma Devi, I Made Gde Meranggi Dana

Journal: Indonesian Journal of Applied Statistics Year: 2020 Vol: 3 (1) Pages: 64-64