Most text mining techniques depend on statistical analysis of terms. Statistical analysis traces important terms within a document only. The concept-based mining model, in contrast, analyzes terms at the sentence, document, and corpus levels. The model consists of sentence-based concept analysis, document-based concept analysis, corpus-based concept analysis, and a concept-based similarity measure. Similarity based on matching concepts between document pairs is shown to have a more significant effect on clustering quality, because concepts are less sensitive than raw terms to noise that can lead to an incorrect similarity. Usually, in text mining techniques, the frequency of a term (word or phrase) is computed to explore the importance of the term in the web document. However, two terms can have the same frequency in their documents while one contributes more to the meaning of its sentences than the other. Experimental results show that text clustering quality is enhanced by using sentence-based, document-based, corpus-based, and combined concept analysis.
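The point about two terms sharing a frequency while contributing differently to sentence meaning can be illustrated with a minimal sketch. It rests on assumptions not stated in the abstract: each sentence is taken to be already annotated with verb-argument structures (hand-labeled here; a semantic role labeler would normally supply them), and the sentence-level score is one plausible choice, namely the average number of structures per sentence that contain the term. The names `DOC`, `term_frequency`, and `conceptual_term_frequency` are hypothetical and chosen only for illustration.

```python
# Toy document: every sentence carries its tokens plus hand-labeled
# verb-argument structures (an assumption made for this illustration).
DOC = [
    {
        "tokens": ["noise", "degrades", "similarity", "and",
                   "hurts", "clustering", "today"],
        "structures": [
            {"noise", "degrades", "similarity"},
            {"noise", "hurts", "clustering", "today"},
        ],
    },
    {
        "tokens": ["noise", "misleads", "users", "and",
                   "inflates", "scores", "today"],
        "structures": [
            {"noise", "misleads", "users"},
            {"noise", "inflates", "scores", "today"},
        ],
    },
]


def term_frequency(doc, term):
    """Plain document-level count of how often the term occurs."""
    return sum(sentence["tokens"].count(term) for sentence in doc)


def conceptual_term_frequency(doc, term):
    """Average number of verb-argument structures containing the term,
    taken over the sentences in which the term appears (assumed scoring)."""
    per_sentence = [
        sum(term in structure for structure in sentence["structures"])
        for sentence in doc
        if term in sentence["tokens"]
    ]
    return sum(per_sentence) / len(per_sentence) if per_sentence else 0.0


for term in ("noise", "today"):
    print(term, term_frequency(DOC, term), conceptual_term_frequency(DOC, term))
```

In this toy document both terms occur twice, yet "noise" takes part in every verb-argument structure of its sentences while "today" appears in only one per sentence, so the sentence-level score separates them where the raw frequency does not.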
Shady Shehata, Fakhri Karray, Mohamed S. Kamel
Kritarth Prasad, K. Swarupa Rani
Ali Mansour, Juman Mohammad, Yury Kravchenko