Sérgio Mário Lins Galdino, Jornandes Dias da Silva
In this paper, we describe a hybrid clustering procedure that is well suited to large data sets. It combines K-Means clustering, which handles large data sets, with an Interval-valued data-type Hierarchical Clustering (IHCA). Hierarchical cluster analysis is especially helpful when we want to detect the appropriate number of clusters. The hybrid procedure follows a two-step scheme. First, we use the K-Means algorithm to create pre-clusters (e.g., 30), each containing only a few examples. Second, we start the IHCA from these pre-clusters to build the dendrogram. Each pre-cluster is summarized by an interval data vector; such vectors contain more information than point-valued data, and this informational advantage can be exploited to yield a more efficient analysis. The main goal of this paper is to show that hybrid cluster analysis is appropriate. A simple case study demonstrates the procedure for combining K-Means and IHCA, which finds representative groups and thus shows the efficiency of the approach.
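The two-step scheme described in this abstract can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the abstract does not specify the interval dissimilarity or the merge rule, so the code assumes a city-block Hausdorff distance between interval vectors and represents a merged cluster by the interval hull of its children. All function names (`kmeans`, `to_intervals`, `hausdorff`, `interval_hca`) and the synthetic data are hypothetical.

```python
import random

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's K-Means; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [min(range(k),
                      key=lambda c: sum((x - m) ** 2 for x, m in zip(p, centers[c])))
                  for p in points]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old center if a cluster went empty
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels

def to_intervals(points, labels):
    """Summarize each non-empty pre-cluster as per-feature (min, max) intervals."""
    groups = {}
    for p, l in zip(points, labels):
        groups.setdefault(l, []).append(p)
    return [[(min(col), max(col)) for col in zip(*members)]
            for members in groups.values()]

def hausdorff(a, b):
    """Assumed dissimilarity: sum over features of the interval Hausdorff distance."""
    return sum(max(abs(lo1 - lo2), abs(hi1 - hi2))
               for (lo1, hi1), (lo2, hi2) in zip(a, b))

def interval_hca(intervals):
    """Agglomerative clustering on interval vectors; returns the merge history
    as (left_id, right_id, distance) tuples, i.e. the dendrogram's merges."""
    active = {i: iv for i, iv in enumerate(intervals)}
    next_id = len(intervals)
    merges = []
    while len(active) > 1:
        ids = sorted(active)
        # Find the closest pair among the currently active clusters.
        d, i, j = min((hausdorff(active[i], active[j]), i, j)
                      for n, i in enumerate(ids) for j in ids[n + 1:])
        a, b = active.pop(i), active.pop(j)
        # Assumed merge rule: the interval hull of the two children.
        active[next_id] = [(min(l1, l2), max(h1, h2))
                           for (l1, h1), (l2, h2) in zip(a, b)]
        merges.append((i, j, d))
        next_id += 1
    return merges

# Synthetic demo data (hypothetical): three 2-D Gaussian blobs of 100 points each.
rng = random.Random(1)
data = [(rng.gauss(cx, 0.3), rng.gauss(cy, 0.3))
        for cx, cy in [(0, 0), (5, 5), (0, 5)] for _ in range(100)]

labels = kmeans(data, k=30)             # step 1: about 30 small pre-clusters
intervals = to_intervals(data, labels)  # interval summary of each pre-cluster
merges = interval_hca(intervals)        # step 2: dendrogram merge history
```

The distances recorded in `merges` play the role of dendrogram heights; inspecting where they jump is one common way to choose the number of clusters. The hierarchical step here operates on at most 30 interval vectors rather than on all 300 points, which is the source of the scalability the abstract refers to.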
D. S. Guru, N. Vinay Kumar, Mahamad Suhil
Pierpaolo D’Urso, Paolo Giordani