JOURNAL ARTICLE

Grid-based subspace clustering over data streams

Abstract

A real-life data stream usually contains many dimensions and some dimensional values of its data elements may be missing. In order to effectively extract the on-going change of a data stream with respect to all the subsets of the dimensions of the data stream, a grid-based subspace clustering algorithm is proposed in this paper. Given an n-dimensional data stream, the on-going distribution statistics of data elements in each one-dimension data space is firstly monitored by a list of grid-cells called a sibling list. Once a dense grid-cell of a first-level sibling list becomes a dense unit grid-cell, new second-level sibling lists are created as its child nodes in order to trace any cluster in all possible two-dimensional rectangular subspaces. In such a way, a sibling tree grows up to the nth level at most and a k-dimensional subcluster can be found in the kth level of the sibling tree. The proposed method is comparatively analyzed by a series of experiments to identify its various characteristics.

Keywords:
Cluster analysis Linear subspace Computer science Data stream mining Subspace topology Grid Data mining Tree (set theory) Dimension (graph theory) Data stream Data stream clustering Algorithm Mathematics Correlation clustering CURE data clustering algorithm Artificial intelligence Combinatorics

Metrics

25
Cited By
2.33
FWCI (Field Weighted Citation Impact)
19
Refs
0.89
Citation Normalized Percentile
Is in top 1%
Is in top 10%

Citation History

Topics

Advanced Clustering Algorithms Research
Physical Sciences →  Computer Science →  Artificial Intelligence
Data Stream Mining Techniques
Physical Sciences →  Computer Science →  Artificial Intelligence
Advanced Data Compression Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
© 2026 ScienceGate Book Chapters — All rights reserved.