Yufang Liu, Shibin Xiao, Xueqiang Lv, Shuicai Shi
Building on a study of the K-means algorithm for text clustering and of the semantic-based vector space model, a semantic-based K-means text clustering model is proposed to address the high dimensionality and sparsity of text data sets. The model reduces the semantic loss of the text data and improves the quality of text clustering. Experiments show that semantic-based text clustering improves the final F1 value by more than 6 percent over non-semantic-based clustering.
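The idea summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the term-to-concept map `CONCEPTS`, the helper names, the cosine distance, and the seeding of centroids with the first `k` documents are all assumptions made for illustration. The sketch only shows how mapping synonymous terms onto shared concepts shrinks the vector space before a plain K-means pass.

```python
import math
from collections import Counter

# Hypothetical term-to-concept map (an assumption; the abstract does not
# say which semantic resource the model uses).
CONCEPTS = {
    "car": "vehicle", "automobile": "vehicle", "truck": "vehicle",
    "apple": "fruit", "banana": "fruit", "orange": "fruit",
}

def to_concept_vector(text, vocab):
    """Count concepts (or the raw term when unmapped) over a fixed vocabulary."""
    counts = Counter(CONCEPTS.get(tok, tok) for tok in text.lower().split())
    return [float(counts[c]) for c in vocab]

def cosine_distance(a, b):
    """1 - cosine similarity; empty vectors are treated as maximally distant."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0 or nb == 0:
        return 1.0
    return 1.0 - dot / (na * nb)

def kmeans(vectors, k, iterations=10):
    """Plain K-means; the first k vectors seed the centroids (a simplification)."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each document goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: cosine_distance(v, centroids[j]))
            for v in vectors
        ]
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

docs = ["car truck", "apple banana", "automobile", "orange apple"]
# Six distinct surface terms collapse into two concept dimensions.
vocab = sorted({CONCEPTS.get(t, t) for d in docs for t in d.lower().split()})
vectors = [to_concept_vector(d, vocab) for d in docs]
labels = kmeans(vectors, k=2)
print(vocab, labels)
```

In this toy corpus "car truck" and "automobile" share no surface term, so a purely term-based representation would give them zero similarity; after the concept mapping they fall on the same axis and are grouped together.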
Xiuguo Chen, Wensheng Yin, Pinghui Tu, Hengxi Zhang
Mingyu Yao, Dechang Pi, Xiangxiang Cong