G. Suresh Reddy, T. V. Rajinikanth, A. Ananda Rao
Text clustering is an unsupervised process that relies solely on the similarity relationships between documents and produces a set of clusters as output [14]. In this research, a commonality measure between two text files is defined and used as the similarity measure. The main idea is to apply an existing frequent-itemset mining algorithm, such as Apriori or FP-tree, to the initial set of text files in order to reduce the dimensionality of the input. A document feature vector is then formed for every document, followed by a vector for all the static input text files. From the initial set of text files, the algorithm outputs a set of clusters.
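The pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the formula for the commonality measure, so the Jaccard-style ratio used here is an assumption, as are the function names, the `min_support` and `threshold` parameters, and the greedy single-pass grouping. Only the first level of Apriori (frequent single terms) is computed, as a stand-in for a full frequent-itemset miner.

```python
def frequent_terms(docs, min_support):
    """Keep terms that occur in at least `min_support` documents.

    This is only the first Apriori pass (frequent 1-itemsets), used here
    to shrink the vocabulary, i.e. to reduce dimensionality.
    """
    counts = {}
    for doc in docs:
        for term in set(doc.lower().split()):
            counts[term] = counts.get(term, 0) + 1
    return sorted(t for t, c in counts.items() if c >= min_support)


def feature_vector(doc, vocab):
    """Binary vector: 1 if the frequent term appears in the document."""
    words = set(doc.lower().split())
    return [1 if t in words else 0 for t in vocab]


def commonality(u, v):
    """ASSUMED measure: shared frequent terms / frequent terms in either file."""
    shared = sum(1 for a, b in zip(u, v) if a and b)
    either = sum(1 for a, b in zip(u, v) if a or b)
    return shared / either if either else 0.0


def cluster(docs, min_support=2, threshold=0.5):
    """Greedy grouping (an assumption): join the first cluster whose
    first member is similar enough, otherwise start a new cluster."""
    vocab = frequent_terms(docs, min_support)
    vectors = [feature_vector(d, vocab) for d in docs]
    clusters = []  # each cluster is a list of document indices
    for i, vec in enumerate(vectors):
        for members in clusters:
            if commonality(vec, vectors[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


docs = [
    "apple banana fruit",
    "banana apple juice",
    "car engine wheel",
    "engine car road",
]
print(cluster(docs))  # [[0, 1], [2, 3]]
```

With `min_support=2`, the vocabulary shrinks from eight terms to four (`apple`, `banana`, `car`, `engine`), which is the dimensionality reduction the abstract refers to; the two fruit documents and the two car documents then fall into separate clusters.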
Vijay Kumar Gupta, Maitreyee Dutta, Manoj Kumar
Florian W. Beil, Martin Ester, Xiaowei Xu
Harsha Patil, Ramjeevan Singh Thakur
Manoj Kumar, Dharmendra Kumar Yadav, Vijay Kumar Gupta