Due to the dramatic increase in data volumes across applications, it is becoming infeasible to keep all of the data on one centralized machine, and increasingly natural to work with distributed databases and networks. This has motivated the introduction of distributed data mining techniques. One of the most important data mining problems is data clustering. While many clustering algorithms exist for centralized databases, efficient algorithms for distributed databases are lacking. In this paper, an efficient algorithm is proposed for clustering distributed databases. The proposed methodology employs an iterative optimization technique to achieve a better value of the clustering objective. The experimental results reported in this paper show that the proposed technique outperforms a recently proposed algorithm based on a distributed version of the well-known K-Means algorithm (Datta et al. 2009) [1].