Panagiotis A. Traganitis, Georgios B. Giannakis
The immense amount of daily generated and communicated data presents unique challenges in their processing. Clustering, the grouping of data without the presence of ground-truth labels, is an important tool for drawing inferences from data. Subspace clustering (SC) is a relatively recent method that is able to successfully classify nonlinearly separable data in a multitude of settings. In spite of their high clustering accuracy, SC methods incur prohibitively high computational complexity when processing large volumes of high-dimensional data. Inspired by random sketching approaches for dimensionality reduction, the present paper introduces a randomized scheme for SC, termed Sketch-SC, tailored for large volumes of high-dimensional data. Sketch-SC accelerates the computationally heavy parts of state-of-the-art SC approaches by compressing the data matrix across both dimensions using random projections, thus enabling fast and accurate large-scale SC. Performance analysis as well as extensive numerical tests on real data corroborate the potential of Sketch-SC and its competitive performance relative to state-of-the-art scalable SC approaches.
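To make the idea of "compressing the data matrix across both dimensions using random projections" concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the Sketch-SC algorithm itself: the abstract does not specify the sketching matrices, so Gaussian projections are assumed, and the function name `sketch_both_dimensions` and the parameters `d`, `n`, and `seed` are hypothetical. Data points are assumed to be stored as the columns of a D x N matrix.

```python
import numpy as np


def sketch_both_dimensions(X, d, n, seed=0):
    """Compress a D x N data matrix X along both of its dimensions.

    Assumption: Gaussian random projections are used for both sketches.
    Returns (X_reduced, B), where X_reduced is d x N and B is d x n.
    """
    rng = np.random.default_rng(seed)
    D, N = X.shape

    # Left sketch: maps each data point from D ambient dimensions to d.
    R_left = rng.standard_normal((d, D)) / np.sqrt(d)
    # Right sketch: mixes the N data points into n random combinations.
    R_right = rng.standard_normal((N, n)) / np.sqrt(n)

    X_reduced = R_left @ X       # d x N, dimensionality-reduced data
    B = X_reduced @ R_right      # d x n, compressed across both dimensions
    return X_reduced, B


# Toy data: 200 points drawn from a 3-dimensional subspace of R^50.
rng = np.random.default_rng(1)
basis = rng.standard_normal((50, 3))
X = basis @ rng.standard_normal((3, 200))

X_reduced, B = sketch_both_dimensions(X, d=10, n=40)
print(X_reduced.shape, B.shape)  # (10, 200) (10, 40)
```

In this toy setting, a downstream SC step could operate on the 10 x 40 matrix `B` (or on `X_reduced`) instead of the original 50 x 200 matrix, which is where the computational savings would come from. How the sketched quantities enter the actual SC optimization is not stated in the abstract and is left out here.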
Sai Kiran Kadambari, Sundeep Prabhakar Chepuri
Daniel Pimentel-Alarcón, Laura Balzano, Robert D. Nowak
Panagiotis A. Traganitis, Georgios B. Giannakis
Shaoguang Huang, Hongyan Zhang, Aleksandra Pižurica