Fast K-Means Algorithm Clustering

Raied Salman; Vojislav Kecman; Qi Li; Robert Strack; Erik Test

doi:10.5121/ijcnc.2011.3402


JOURNAL ARTICLE

Year: 2011
Journal: International Journal of Computer Networks & Communications
Vol: 3 (4)
Pages: 17-31

Abstract

k-means has recently been recognized as one of the best algorithms for clustering unsupervised data. Since k-means depends mainly on distance calculations between all data points and the centers, the time cost is high when the dataset is large (for example, more than 500 million points). We propose a two-stage algorithm to reduce the time cost of distance calculation for huge datasets. The first stage is a fast distance calculation that uses only a small portion of the data to produce the best possible location of the centers. The second stage is a slow distance calculation whose initial centers are taken from the first stage. The names "fast" and "slow" refer to the speed at which the centers move. In the slow stage, the whole dataset can be used to get the exact location of the centers. The time cost of the distance calculation in the fast stage is very low because the chosen training sample is small. The time cost of the distance calculation in the slow stage is also kept low because only a small number of iterations is needed. Different initial locations of the clusters were used when testing the proposed algorithm. For large datasets, experiments show that the two-stage clustering method achieves a better speed-up (1-9 times).
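The two stages described in the abstract can be sketched roughly as follows. This is a minimal illustration in plain Python, not the authors' implementation: the abstract does not say how the small portion is chosen, how the first centers are initialized, or when iteration stops, so the uniform random sample, the `sample_fraction` parameter, the random initial centers, and the `tol` stopping rule are all assumptions, and the names `lloyd` and `two_stage_kmeans` are hypothetical.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def lloyd(points, centers, max_iter=100, tol=1e-12):
    """Standard k-means (Lloyd) iterations from the given starting centers.

    Returns (centers, iterations_used).
    """
    centers = [tuple(c) for c in centers]
    iterations = 0
    for iterations in range(1, max_iter + 1):
        # Assignment step: attach every point to its nearest center.
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))
            groups[nearest].append(p)
        # Update step: move each center to the mean of its group.
        # An empty group keeps its previous center (an assumption of this sketch).
        new_centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else c
            for g, c in zip(groups, centers)
        ]
        shift = max(sq_dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift <= tol:  # centers have stopped moving
            break
    return centers, iterations


def two_stage_kmeans(points, k, sample_fraction=0.1, seed=0):
    """Fast stage on a small random sample, then slow stage on all points."""
    rng = random.Random(seed)
    sample_size = max(k, int(len(points) * sample_fraction))
    sample = rng.sample(points, sample_size)

    # Fast stage: cheap iterations, because only the sample is scanned.
    initial = rng.sample(sample, k)
    fast_centers, fast_iters = lloyd(sample, initial)

    # Slow stage: full-data iterations, started from the fast-stage centers.
    final_centers, slow_iters = lloyd(points, fast_centers)
    return final_centers, fast_iters, slow_iters
```

Calling `two_stage_kmeans(points, k)` on a list of equal-length tuples returns the final centers together with the iteration counts of each stage. Comparing `slow_iters` with the iterations that `lloyd` needs when started directly from random centers on the full data is one way to observe the effect the abstract describes.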

Keywords:

Cluster analysis, Computer science, Stage (stratigraphy), Algorithm, Data point, Data mining, Artificial intelligence

Metrics

FWCI (Field Weighted Citation Impact): 0.39

Citation Normalized Percentile: 0.71

Topics

Advanced Clustering Algorithms Research

Physical Sciences → Computer Science → Artificial Intelligence

Data Management and Algorithms

Physical Sciences → Computer Science → Signal Processing

Face and Expression Recognition

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Related Documents

A FAST K-MEANS TYPE CLUSTERING ALGORITHM

Bilateral k-Means Algorithm for Fast Co-Clustering

Fast K-Means Clustering Algorithm using Prediction Data

Weighted bilateral K-means algorithm for fast co-clustering and fast spectral clustering

A Highly Efficient Fast Global K-Means Clustering Algorithm