The K-nearest neighbor (K-NN) algorithm is an important approach to automatic text classification. In this paper, clustering is applied to overcome the disadvantages of the traditional K-NN algorithm. First, the training set is clustered with an improved K-means approach, and the most representative samples are selected as cluster centers. Then the similarity between each test sample and the central vector of each cluster is computed. On this basis, a cluster-based K-NN algorithm is presented. Experimental results show that this classification algorithm is much faster than the traditional K-NN algorithm and that it also improves accuracy.
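The pipeline summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes plain Lloyd-style K-means run separately within each class (the paper's improved K-means variant is not specified here), cosine similarity as the similarity measure, and a majority vote over the k most similar cluster centers. The names `cosine`, `kmeans`, `fit_centers`, and `classify`, and the parameters `clusters_per_class` and `k`, are hypothetical.

```python
import math
import random
from collections import Counter, defaultdict


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def kmeans(vectors, n_clusters, iters=20, seed=0):
    """Plain Lloyd-style K-means (assumption: not the paper's improved variant)."""
    rng = random.Random(seed)
    centers = rng.sample(vectors, min(n_clusters, len(vectors)))
    for _ in range(iters):
        groups = defaultdict(list)
        for v in vectors:
            # Assign each vector to its most similar center.
            best = max(range(len(centers)), key=lambda i: cosine(v, centers[i]))
            groups[best].append(v)
        # Recompute centers; keep the old center if a cluster is empty.
        centers = [mean(groups[i]) if groups[i] else centers[i]
                   for i in range(len(centers))]
    return centers


def fit_centers(samples, labels, clusters_per_class=2):
    """Cluster each class separately and return (center, label) pairs."""
    by_class = defaultdict(list)
    for v, y in zip(samples, labels):
        by_class[y].append(v)
    return [(c, y)
            for y, vs in by_class.items()
            for c in kmeans(vs, clusters_per_class)]


def classify(x, labeled_centers, k=3):
    """Majority vote among the k cluster centers most similar to x."""
    ranked = sorted(labeled_centers, key=lambda cy: cosine(x, cy[0]), reverse=True)
    votes = Counter(y for _, y in ranked[:k])
    return votes.most_common(1)[0][0]
```

In this sketch, each test sample is compared only with the cluster centers rather than with every training sample, so the number of similarity computations per query drops from the size of the training set to the number of centers.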
Xianfei Zhang, Bicheng Li, Xianzhu Sun
Hyung-Seok Kang, Kihyo Nam, Seong-in Kim
Yulong Qiao, Jeng‐Shyang Pan, Sun Sheng-he