Most real-world data sets are characterized by a high-dimensional, inherently sparse data space. In this paper, we present a novel density-based approach to the subspace clustering problem. A new framework for data stream mining, called the weighted sliding window, is introduced. In the online component, the Exponential Histogram of Cluster Feature (EHCF) structure is improved to maintain the micro-clusters. The concepts of potential core-micro-cluster and outlier micro-cluster are applied to distinguish potential clusters from outliers. A novel pruning strategy is proposed to decrease the number of micro-clusters. In the offline component, the final clusters are generated by the SUBCLU algorithm. Our performance study demonstrates the effectiveness and efficiency of our algorithm.
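To make the online component more concrete, the following is a minimal sketch of how a time-weighted micro-cluster might be maintained and labeled as a potential core-micro-cluster or an outlier micro-cluster. It is not the improved EHCF structure or the pruning strategy of this paper. The exponential decay function, the parameter names `decay_rate`, `mu`, and `beta`, and the threshold rule `weight >= beta * mu` are all assumptions borrowed from common density-based stream clustering practice.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MicroCluster:
    """Hypothetical cluster feature: decayed weight and decayed linear sum."""
    dim: int
    weight: float = 0.0
    linear_sum: List[float] = field(default_factory=list)
    last_update: float = 0.0

    def __post_init__(self):
        if not self.linear_sum:
            self.linear_sum = [0.0] * self.dim

    def decay(self, now: float, decay_rate: float) -> None:
        # Assumed weighting: older points fade by a factor 2^(-decay_rate * dt).
        factor = 2.0 ** (-decay_rate * (now - self.last_update))
        self.weight *= factor
        self.linear_sum = [s * factor for s in self.linear_sum]
        self.last_update = now

    def insert(self, point: List[float], now: float, decay_rate: float) -> None:
        # Fade the existing summary first, then absorb the new point with weight 1.
        self.decay(now, decay_rate)
        self.weight += 1.0
        self.linear_sum = [s + x for s, x in zip(self.linear_sum, point)]

    def center(self) -> List[float]:
        return [s / self.weight for s in self.linear_sum]


def classify(mc: MicroCluster, now: float, decay_rate: float,
             mu: float, beta: float) -> str:
    """Label a micro-cluster by its decayed weight (assumed rule).

    Note: this updates the micro-cluster's decayed state to time `now`.
    """
    mc.decay(now, decay_rate)
    return "potential-core" if mc.weight >= beta * mu else "outlier"
```

Under these assumptions, a micro-cluster that stops receiving points gradually loses weight and eventually drops from potential core to outlier status, at which point a pruning rule could discard it.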
K. Shyam Sunder Reddy, C. Shoba Bindu
Ta Minh Thuy, Hoai An Le Thi, Lydia Boudjeloud-Assala