JOURNAL ARTICLE

Compressed K-Means for Large-Scale Clustering

Xiaobo Shen, Weiwei Liu, Ivor W. Tsang, Fumin Shen, Quansen Sun

Year: 2017; Journal: Proceedings of the AAAI Conference on Artificial Intelligence; Vol: 31 (1); Publisher: Association for the Advancement of Artificial Intelligence

Abstract

Large-scale clustering is widely used in many applications and has received much attention. Most existing clustering methods incur high computation and memory costs when applied to large-scale datasets. In this paper, we propose a novel clustering method, dubbed compressed k-means (CKM), for fast large-scale clustering. Specifically, high-dimensional data are compressed into short binary codes, which are well suited for fast clustering. CKM enjoys two key benefits: 1) storage is significantly reduced by representing data points as binary codes; 2) distance computation is very efficient using the Hamming metric between binary codes. We propose to jointly learn binary codes and clusters within one framework. Extensive experimental results on four large-scale datasets, including two million-scale datasets, demonstrate that CKM outperforms state-of-the-art large-scale clustering methods in terms of both computation and memory cost, while achieving comparable clustering accuracy.
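The two benefits named in the abstract (compact storage and cheap Hamming distances) can be illustrated with a minimal sketch. This is not the authors' CKM algorithm: it does not learn binary codes or clusters jointly, and it assumes the binary codes and binary cluster centers are already given. The helper names `pack_codes`, `hamming_to_centers`, and `assign` are hypothetical, introduced here only for illustration.

```python
import numpy as np

# Lookup table: number of set bits in each possible byte value (0..255).
POPCOUNT = np.array([bin(i).count("1") for i in range(256)], dtype=np.uint8)


def pack_codes(bits):
    """Pack an (n, b) array of 0/1 values into (n, ceil(b / 8)) bytes.

    Assumption for this sketch: the binary codes already exist; how they
    are learned is outside the scope of this example.
    """
    return np.packbits(np.asarray(bits, dtype=np.uint8), axis=1)


def hamming_to_centers(codes, centers):
    """Return the (n, k) matrix of Hamming distances between packed codes."""
    # XOR marks the bit positions where a point and a center differ.
    xor = np.bitwise_xor(codes[:, None, :], centers[None, :, :])
    # Count the differing bits per byte, then sum over the bytes of a code.
    return POPCOUNT[xor].sum(axis=2, dtype=np.int64)


def assign(codes, centers):
    """Assign each packed code to its nearest center in Hamming distance."""
    return hamming_to_centers(codes, centers).argmin(axis=1)


# Three 8-bit points and two 8-bit centers, each stored in a single byte.
points = pack_codes([[0, 0, 0, 0, 1, 1, 1, 1],
                     [1, 1, 1, 1, 0, 0, 0, 0],
                     [0, 0, 0, 1, 1, 1, 1, 1]])
centers = pack_codes([[0, 0, 0, 0, 1, 1, 1, 1],
                      [1, 1, 1, 1, 0, 0, 0, 0]])
labels = assign(points, centers)
```

In this toy setting each point occupies one byte, whereas the same 8 dimensions stored as 32-bit floats would take 32 bytes. The distance step uses only XOR and bit counting, which is the kind of operation the abstract refers to as efficient under the Hamming metric.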

Keywords:
Cluster analysis; Computer science; Computation; Hamming distance; Scale (ratio); Binary number; CURE data clustering algorithm; Data mining; Clustering high-dimensional data; Correlation clustering; Metric (unit); Algorithm; Artificial intelligence; Mathematics

Metrics

Cited By: 69
FWCI (Field Weighted Citation Impact): 4.57
Refs: 46
Citation Normalized Percentile: 0.95

Topics

Advanced Clustering Algorithms Research
Physical Sciences →  Computer Science →  Artificial Intelligence
Advanced Image and Video Retrieval Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Face and Expression Recognition
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition

Related Documents

JOURNAL ARTICLE

Scalable k-means for large-scale clustering

Yuewei Ming, En Zhu, Mao Wang, Qiang Liu, Xinwang Liu, Jianping Yin

Journal: Intelligent Data Analysis; Year: 2019; Vol: 23 (4); Pages: 825-838

JOURNAL ARTICLE

Large scale K-means clustering using GPUs

Mi Li, Eibe Frank, Bernhard Pfahringer

Journal: Data Mining and Knowledge Discovery; Year: 2022; Vol: 37 (1); Pages: 67-109

JOURNAL ARTICLE

Large-scale k-means clustering via variance reduction

Yawei Zhao, Yuewei Ming, Xinwang Liu, En Zhu, Kaikai Zhao, Jianping Yin

Journal: Neurocomputing; Year: 2018; Vol: 307; Pages: 184-194