Entropy-Based Feature Selection for Data Clustering Using k-Means and k-Medoids Algorithms

Moni Kishore Dhar; S. M. Nahid Hasan; Tahsin Rahaman Otushi; Musharrat Khan

doi:10.1109/icrcicn50933.2020.9296186

JOURNAL ARTICLE

Year: 2020 Pages: 36-40

Abstract

Clustering splits a large dataset into smaller subsets, each of which is called a cluster. The members of a cluster share similar characteristics, and each cluster differs from all other clusters. The most common clustering algorithms are the k-Means and k-Medoids algorithms. Clustering a high-dimensional dataset can be difficult. To overcome this problem, the dimension of the dataset is reduced. In the present work, we reduce the dimension of a dataset by selecting a suitable subset of features using an entropy-based method. We calculate entropy using both Euclidean and Manhattan distances. We experiment with three widely used datasets from the Machine Learning Repository of the University of California, Irvine (UCI). From the experimental results, we conclude that our approach achieves higher clustering accuracy than previous work.
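The abstract does not state the entropy formula, so the following is only a minimal sketch of how distance-based entropy can rank features before clustering. It assumes a commonly used form in which pairwise similarity is S = exp(-alpha * D), alpha is chosen so that S = 0.5 at the mean distance, and a feature is considered more important when removing it raises the entropy. The function names `pairwise_distance`, `similarity_entropy`, and `rank_features` are hypothetical and are not taken from the paper.

```python
import math
from itertools import combinations


def pairwise_distance(a, b, metric="euclidean"):
    """Distance between two equal-length numeric vectors."""
    if metric == "euclidean":
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    if metric == "manhattan":
        return sum(abs(x - y) for x, y in zip(a, b))
    raise ValueError(f"unknown metric: {metric}")


def similarity_entropy(rows, metric="euclidean"):
    """Entropy of pairwise similarities S = exp(-alpha * D).

    Assumed form, not confirmed by the abstract:
    E = -sum over pairs of [S*log2(S) + (1-S)*log2(1-S)].
    """
    dists = [pairwise_distance(a, b, metric) for a, b in combinations(rows, 2)]
    if not dists:
        return 0.0  # fewer than two rows: no pairs to compare
    mean_d = sum(dists) / len(dists)
    if mean_d == 0:
        return 0.0  # all points coincide
    alpha = math.log(2) / mean_d  # gives S = 0.5 at the mean distance
    total = 0.0
    for d in dists:
        s = math.exp(-alpha * d)
        for p in (s, 1.0 - s):
            if p > 0.0:  # skip p == 0, whose contribution is zero
                total -= p * math.log2(p)
    return total


def rank_features(rows, metric="euclidean"):
    """Return feature indices ordered from most to least important.

    Importance is scored by the entropy of the data with that feature
    removed (assumption: higher entropy after removal means the feature
    contributed more cluster structure).
    """
    n_features = len(rows[0])
    scores = []
    for f in range(n_features):
        reduced = [[v for j, v in enumerate(r) if j != f] for r in rows]
        scores.append((similarity_entropy(reduced, metric), f))
    return [f for _, f in sorted(scores, reverse=True)]
```

Under these assumptions, the top-ranked features would be kept and the reduced dataset passed to k-Means or k-Medoids; switching `metric` between `"euclidean"` and `"manhattan"` mirrors the two distance choices mentioned in the abstract.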

Keywords:

Cluster analysis; Computer science; Entropy (arrow of time); k-medians clustering; CURE data clustering algorithm; Correlation clustering; Single-linkage clustering; k-medoids; Euclidean distance; Clustering high-dimensional data; Data mining; Pattern recognition (psychology); Medoid; Artificial intelligence; Fuzzy clustering; Canopy clustering algorithm; Determining the number of clusters in a data set; k-means clustering

Metrics

FWCI (Field Weighted Citation Impact): 0.44

Citation Normalized Percentile: 0.70

Topics

Advanced Clustering Algorithms Research

Physical Sciences → Computer Science → Artificial Intelligence

Data Mining Algorithms and Applications

Physical Sciences → Computer Science → Information Systems

Data Management and Algorithms

Physical Sciences → Computer Science → Signal Processing

Related Documents

Data Clustering Using Hybrid Genetic Algorithm with k-Means and k-Medoids Algorithms

IMPROVED PARALLEL BIG DATA CLUSTERING BASED ON K-MEDOIDS AND K-MEANS ALGORITHMS

Clustering Lung Cancer Data by k-Means and k-Medoids Algorithms

Comparison between K-Means and K-Medoids Clustering Algorithms

Laboratory Clustering using K-Means, K-Medoids, and Model-Based Clustering