JOURNAL ARTICLE

A k-Means-Based Projected Clustering Algorithm

Abstract

In high-dimensional data spaces, clusters are likely to exist in different subspaces. K-means is a classic clustering algorithm, but it cannot find subspace clusters. This paper presents GKM, an algorithm that generalizes k-means to high-dimensional data. In the objective function of GKM, a weight vector is associated with each cluster to indicate which dimensions are relevant to that cluster. To prevent the value of the objective function from decreasing merely because dimensions are eliminated, virtual dimensions are added to the objective function. The values of data points on the virtual dimensions are set artificially so that the objective function is minimized when the real subspace clusters, or the clusters in the original space, are found. GKM preserves the advantages of k-means and can identify subspace clusters with linear time complexity. A performance study on a synthetic dataset and a real dataset demonstrates the efficiency and effectiveness of GKM.
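
The abstract does not give GKM's objective function, so the following is only a minimal sketch of the general idea, not the paper's algorithm. It assumes binary per-cluster weights and a hypothetical constant `virtual_cost` that each dropped dimension charges every point, standing in for the "virtual dimension", so that eliminating a dimension never makes the objective cheaper for free. The function name, parameters, and data are illustrative assumptions.

```python
def projected_kmeans(points, init_centers, virtual_cost, iters=10):
    """Toy k-means with per-cluster binary dimension weights.

    Illustrative only: NOT the GKM formulation from the paper.
    `virtual_cost` is an assumed constant charged per dropped dimension.
    """
    k = len(init_centers)
    dims = len(points[0])
    centers = [list(c) for c in init_centers]
    weights = [[1] * dims for _ in range(k)]  # start with all dimensions relevant
    labels = [0] * len(points)

    def cost(p, c):
        # Relevant dimensions pay squared distance to the center;
        # dropped dimensions pay the fixed virtual cost instead.
        return sum(
            (p[j] - centers[c][j]) ** 2 if weights[c][j] else virtual_cost
            for j in range(dims)
        )

    for _ in range(iters):
        # Assignment step: each point goes to its cheapest cluster.
        labels = [min(range(k), key=lambda c: cost(p, c)) for p in points]
        # Update step: recompute centers, then keep a dimension only if
        # the cluster's variance on it is below the virtual cost.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if not members:
                continue  # leave an empty cluster unchanged
            for j in range(dims):
                mean = sum(p[j] for p in members) / len(members)
                var = sum((p[j] - mean) ** 2 for p in members) / len(members)
                centers[c][j] = mean
                weights[c][j] = 1 if var < virtual_cost else 0
    return labels, centers, weights


# Two made-up subspace clusters: the first is tight on dimension 0,
# the second is tight on dimension 1.
data = [(0, 0), (0, 10), (0, 20), (0, 30),
        (20, 50), (30, 50), (40, 50), (50, 50)]
labels, centers, weights = projected_kmeans(
    data, init_centers=[(0, 15), (35, 50)], virtual_cost=4.0
)
print(labels)   # [0, 0, 0, 0, 1, 1, 1, 1]
print(weights)  # [[1, 0], [0, 1]]
```

Each iteration touches every point, cluster, and dimension a constant number of times, so the cost per iteration stays linear in the number of points, as in ordinary k-means.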

Keywords:
Cluster analysis; Computer science; Algorithm; Data mining; Artificial intelligence

Metrics

Cited By: 6
FWCI (Field Weighted Citation Impact): 0.80
Refs: 12
Citation Normalized Percentile: 0.80

Topics

Advanced Clustering Algorithms Research
Physical Sciences →  Computer Science →  Artificial Intelligence
Face and Expression Recognition
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Data Mining Algorithms and Applications
Physical Sciences →  Computer Science →  Information Systems
