Representation learning has been a fruitful area in recent years, driven by the growing interest in deep learning methods. In particular, word representation learning, also known as word embeddings, has triggered progress in a range of natural language processing (NLP) tasks. Despite the success of word embeddings in tasks such as named entity recognition and textual entailment, their use in query expansion is still embryonic. In this work, we examine the usefulness of word embeddings for representing queries and documents in query-document matching tasks. For this purpose, we use a re-ranking strategy in which the re-ranking phase relies on representations of queries and documents built from word embeddings. We introduce IDF average word embeddings, a new text representation strategy based on word embeddings that produces a query vector giving more weight to informative terms. Experimental results on TREC benchmark datasets show that our proposal consistently achieves the best results in terms of mean average precision (MAP).
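To make the idea concrete, the following is a minimal sketch of how an IDF-weighted average of word embeddings could be used to re-rank candidate documents. The abstract does not give the exact formulation, so everything here is an assumption for illustration: the smoothed IDF formula, the normalization by the sum of IDF weights, the use of cosine similarity as the matching score, and the hypothetical helper names `compute_idf`, `idf_average_embedding`, `cosine`, and `rerank`.

```python
import math
from collections import Counter

import numpy as np


def compute_idf(corpus):
    """Smoothed IDF per term: log((N + 1) / (df + 1)) + 1 (an assumed variant)."""
    n_docs = len(corpus)
    # Document frequency: count each term once per document.
    df = Counter(term for doc in corpus for term in set(doc))
    return {t: math.log((n_docs + 1) / (c + 1)) + 1.0 for t, c in df.items()}


def idf_average_embedding(tokens, embeddings, idf, dim):
    """IDF-weighted average of the embeddings of the known tokens.

    Tokens missing from `embeddings` or `idf` are skipped; if no token
    is known, a zero vector is returned.
    """
    vec = np.zeros(dim)
    total = 0.0
    for tok in tokens:
        if tok in embeddings and tok in idf:
            vec += idf[tok] * np.asarray(embeddings[tok], dtype=float)
            total += idf[tok]
    return vec / total if total > 0 else vec


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


def rerank(query, candidates, embeddings, idf, dim):
    """Sort (doc_id, tokens) candidates by similarity to the query, best first."""
    q = idf_average_embedding(query, embeddings, idf, dim)
    scored = [
        (doc_id, cosine(q, idf_average_embedding(toks, embeddings, idf, dim)))
        for doc_id, toks in candidates
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

In this sketch, `candidates` would be the top documents returned by a first-stage retrieval model, and a frequent term such as "the" contributes less to the query vector than a rare, informative term because its IDF weight is smaller.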