Users express specific information needs as queries to information retrieval systems. These queries are unstructured and, in most cases, short and ambiguous. Retrieval models built on shallow language statistics, such as the probabilistic model BM25 or the language models used in Indri, can improve retrieval metrics like Mean Average Precision (MAP). However, such methods define relevance by the presence of the query terms in the retrieved document, so a relevant document that uses different vocabulary can be missed. Query expansion is a technique that can overcome this problem by expanding the query with terms drawn from the top few relevant documents of an initial retrieval. The question we try to answer is whether the quality of the corpus used for expansion produces a significant improvement in MAP and in precision at the top 30 retrieved documents (P@30). We show that the quality and the selection criteria of the expansion documents are important factors in query expansion performance.
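To make the idea concrete, the following is a minimal sketch of query expansion from top-ranked documents, together with the two evaluation measures named above. It is an illustration under stated assumptions, not the method evaluated here: the function names (`expand_query`, `precision_at_k`, `average_precision`), the toy stopword list, and the choice of raw term frequency as the selection criterion are all hypothetical. MAP is the mean of `average_precision` over a set of queries.

```python
from collections import Counter

# Hypothetical toy stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for", "on"}


def tokenize(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    return [t for t in text.lower().split() if t not in STOPWORDS]


def expand_query(query, ranked_docs, top_docs=3, num_terms=2):
    """Append the most frequent new terms from the top-ranked documents.

    Assumption: raw term frequency stands in for the term selection
    criterion; other weighting schemes could be substituted.
    """
    query_terms = tokenize(query)
    counts = Counter()
    for doc in ranked_docs[:top_docs]:
        counts.update(t for t in tokenize(doc) if t not in query_terms)
    # Sort by descending frequency, then alphabetically for determinism.
    ranked_terms = sorted(counts, key=lambda t: (-counts[t], t))
    return query_terms + ranked_terms[:num_terms]


def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top-k retrieved documents that are relevant."""
    return sum(1 for d in ranked_ids[:k] if d in relevant_ids) / k


def average_precision(ranked_ids, relevant_ids):
    """Mean of the precision values at each rank where a relevant document appears."""
    hits, total = 0, 0.0
    for rank, d in enumerate(ranked_ids, start=1):
        if d in relevant_ids:
            hits += 1
            total += hits / rank
    return total / len(relevant_ids) if relevant_ids else 0.0


# Toy usage: the ambiguous query "jaguar" is expanded with terms
# from the two top-ranked documents of an initial retrieval.
docs = [
    "jaguar car speed engine",
    "jaguar engine car review",
    "jaguar cat habitat",
]
expanded = expand_query("jaguar", docs, top_docs=2, num_terms=2)
```

In this sketch the expansion documents are simply the first `top_docs` entries of the initial ranking, so the quality of those documents directly determines which terms are added. Precision at the top 30 documents corresponds to `precision_at_k(ranked_ids, relevant_ids, 30)`.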
Andisheh Keikha, Faezeh Ensan, Ebrahim Bagheri
Yang Xu, Gareth J. F. Jones, Bin Wang
Husni Husni, Yeni Kustiyahningsih, Fika Hastarita Rachman, Eka Mala Sari Rochman, Hadi Yulian
Karthik Raman, Raghavendra Udupa, Pushpak Bhattacharya, Abhijit Bhole