Abstract

Data clustering is the process of organizing objects that share the same characteristics into groups, or clusters. It is an unsupervised learning technique for classifying data. The K-Means algorithm is a well-known and widely used algorithm for cluster analysis. It divides n data points into k clusters according to a similarity measure. Because it is fast, K-Means is one of the most commonly used clustering algorithms. Its applications include vector quantization, cluster analysis, and feature learning. However, the results it produces depend heavily on the choice of initial cluster centroids. Its main shortcoming is that an appropriate number of clusters must be supplied in advance; specifying the number of clusters before running the algorithm is highly impractical and requires deep knowledge of the clustering field. In this project, we propose an algorithm that improves centroid initialization for K-Means. We work with both numerical and categorical data sets of n dimensions. For similarity measurement, we consider the Manhattan, Dice, and cosine distances. The results of the proposed algorithm will be compared with those of the original K-Means, and its quality and complexity will be checked against the existing algorithm.
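
The three distance measures named in the abstract can be sketched as follows. This is only an illustrative sketch, not the proposed algorithm: the function names are hypothetical, and treating a categorical record as a set of values for the Dice distance is an assumption, since the abstract does not say how categorical data is encoded. The helper `nearest_centroid` shows the assignment step that any K-Means variant would perform with a chosen distance.

```python
import math


def manhattan_distance(a, b):
    """Sum of absolute coordinate differences between two numerical points."""
    return sum(abs(x - y) for x, y in zip(a, b))


def cosine_distance(a, b):
    """1 minus the cosine similarity of two numerical vectors.

    Assumption: the distance is taken to be 1.0 when either vector is all
    zeros, because the cosine is undefined in that case.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def dice_distance(a, b):
    """1 minus the Dice coefficient of two categorical records.

    Assumption: each record is treated as a set of category values.
    """
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 0.0  # two empty records are considered identical
    return 1.0 - 2 * len(set_a & set_b) / (len(set_a) + len(set_b))


def nearest_centroid(point, centroids, distance):
    """Index of the centroid closest to `point` under the given distance."""
    return min(range(len(centroids)), key=lambda i: distance(point, centroids[i]))
```

For example, `nearest_centroid([1, 1], [[0, 0], [5, 5]], manhattan_distance)` assigns the point to the first centroid. Passing the distance as a parameter keeps the assignment step independent of whether the data is numerical or categorical.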

Keywords:
Cluster analysis; Canopy clustering algorithm; Computer science; Algorithm; Linde–Buzo–Gray algorithm; CURE data clustering algorithm; Determining the number of clusters in a data set; Fuzzy clustering; Vector quantization; Correlation clustering; Single-linkage clustering; Pattern recognition (psychology); Data mining; Data stream clustering; k-medoids; Artificial intelligence

Metrics

Cited By: 22
FWCI (Field Weighted Citation Impact): 2.51
Refs: 9
Citation Normalized Percentile: 0.94

Topics

Advanced Clustering Algorithms Research
Physical Sciences →  Computer Science →  Artificial Intelligence
Data Stream Mining Techniques
Physical Sciences →  Computer Science →  Artificial Intelligence
Advanced Data Compression Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition

Related Documents

JOURNAL ARTICLE

Improvement of K-Means clustering Algorithm

PRAMILA CHAWAN

Journal:   Zenodo (CERN European Organization for Nuclear Research) Year: 2012
JOURNAL ARTICLE

Organization clustering airports using K-Means clustering algorithm

Dyah Lintang Trenggonowati, Maria Ulfah, Ratna Ekawati, Vira Aleyda Yusuf

Journal:   IOP Conference Series Materials Science and Engineering Year: 2019 Vol: 673 (1) Pages: 012081-012081
BOOK-CHAPTER

Clustering Biological Data Using Enhanced k-Means Algorithm

K. A. Abdul Nazeer, M. P. Sebastian

Lecture notes in electrical engineering Year: 2010 Pages: 433-442