Term relation analysis has been used to improve performance in information retrieval. However, choosing appropriate related terms is difficult. Co-occurrence analysis and WordNet have been used to obtain mutual information between terms for re-ranking retrieval results and for query expansion, but they have not improved performance as expected. The process of finding related terms and computing mutual information tends to introduce noise and inappropriate related terms with ambiguous senses. To address this problem, we propose to incorporate a document's context information when choosing related terms by means of a clustering method, and to use the Mahalanobis distance instead of the Euclidean distance when re-ranking query results with term mutual information. The approach presented in this paper can significantly improve precision and relevance in enterprise information retrieval, better satisfying users' needs.
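To illustrate the distance substitution described above, the following is a minimal sketch and not the paper's implementation. It assumes that the query and each retrieved document are represented as vectors of term mutual-information scores, and that the covariance is estimated from the retrieved documents themselves; the abstract specifies neither. The names `mahalanobis_rerank` and `ridge` are hypothetical.

```python
import numpy as np

def mahalanobis_rerank(query_vec, doc_vecs, ridge=1e-6):
    """Rank documents by Mahalanobis distance to the query (closest first).

    Assumption: each row of doc_vecs is a vector of term mutual-information
    scores, and query_vec lives in the same feature space.
    """
    doc_vecs = np.asarray(doc_vecs, dtype=float)
    query_vec = np.asarray(query_vec, dtype=float)

    # Covariance of the features, estimated from the retrieved documents.
    cov = np.atleast_2d(np.cov(doc_vecs, rowvar=False))
    # A small ridge term (an assumption of this sketch) keeps the matrix invertible.
    cov = cov + ridge * np.eye(doc_vecs.shape[1])
    inv_cov = np.linalg.inv(cov)

    # Squared distance (x - q)^T S^{-1} (x - q) for every document row x.
    diffs = doc_vecs - query_vec
    sq_dists = np.einsum("ij,jk,ik->i", diffs, inv_cov, diffs)
    dists = np.sqrt(np.maximum(sq_dists, 0.0))

    # Stable sort so that ties keep the original retrieval order.
    order = np.argsort(dists, kind="stable")
    return order, dists

# Toy data: five documents described by two term-relation features.
docs = np.array([[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [3.0, 1.0], [0.0, 3.0]])
query = np.array([1.0, 1.0])
order, dists = mahalanobis_rerank(query, docs)
```

Unlike the Euclidean distance, the Mahalanobis distance accounts for the variance of each feature and the correlations between features, so a feature with a large numeric range does not dominate the ranking. Rescaling a feature leaves the result essentially unchanged, up to the small ridge term.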
Xuehua Shen, Bin Tan, ChengXiang Zhai
Mordechai Auerbuch, Tom H. Karson, Benjamin Ben-Ami, Oded Maimon, Lior Rokach
Binil Kuriachan, Gopikrishna Yadam, Lakshmi Dinesh