N. V. Sobhana, Saptarshi Ghosh, Pabitra Mitra
Entity relation extraction is an important research field in text mining. Extracting relations between geological entities is of great benefit for developing intelligent search tools for geology researchers. In this paper, Conditional Random Fields (CRFs) and sequence kernels are used to extract relations between entities from a geological corpus. The corpus was developed from a collection of scientific reports and articles on the geology of the Indian subcontinent. The training set, consisting of more than 200K words, has been annotated with a named entity tag set of seventeen tags and with labeled instances of part-of and near-by relations. The system recognizes part-of and near-by relations with F-measures of 71.57% and 77.27%, respectively, for T-CRF, and 78.25% and 83.71% for subsequence kernels. The extracted relations were used for query expansion in a retrieval system, achieving a gain of 10.86% for T-CRF and 10.58% for subsequence kernels over the baseline Mean Average Precision.
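To make the query-expansion step concrete, the following is a minimal sketch of how extracted part-of and near-by relations might be used to widen a query. It is not the authors' retrieval system: the entity names, the `RELATIONS` list, and the `build_index` and `expand_query` helpers are all hypothetical, and the sketch assumes both relation types are treated as symmetric links with one-hop expansion only.

```python
from collections import defaultdict

# Hypothetical extracted relations as (entity, relation, entity) triples.
# The entity names are placeholders, not taken from the paper's corpus.
RELATIONS = [
    ("formation_a", "part-of", "basin_b"),
    ("mine_c", "near-by", "formation_a"),
    ("basin_b", "part-of", "region_d"),
]


def build_index(relations):
    """Map each entity to the entities it is directly related to.

    Assumption: both relation types are treated as undirected links.
    """
    index = defaultdict(set)
    for head, _relation, tail in relations:
        index[head].add(tail)
        index[tail].add(head)
    return index


def expand_query(terms, index):
    """Return the original terms followed by their one-hop related entities."""
    expanded = list(terms)
    for term in terms:
        # Sort neighbours so the expansion order is deterministic.
        for neighbor in sorted(index.get(term, ())):
            if neighbor not in expanded:
                expanded.append(neighbor)
    return expanded


INDEX = build_index(RELATIONS)
EXPANDED = expand_query(["formation_a"], INDEX)
print(EXPANDED)
```

A query term with no extracted relations passes through unchanged, so the expansion can only add terms to the baseline query, never remove them.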
Rathany Chan Sam, Huong Thanh Le, Thuy Thanh Nguyen, Terry Trinh
Xujun Peng, Huaigu Cao, Rohit Prasad, Prem Natarajan
Suxiang Zhang, Suxian Zhang, Guo-Yang Gao
Jana Diesner, Kathleen M. Carley
Yingying Chen, Qinghua Zheng, Wei Wang, Yan Chen