JOURNAL ARTICLE

Improving query expansion strategies with word embeddings

Abstract

Representation learning has been a fruitful area in recent years, driven by the growing interest in deep learning methods. In particular, word representation learning, also known as word embeddings, has triggered progress in different natural language processing (NLP) tasks. Despite the success of word embeddings in tasks such as named entity recognition or textual entailment, their use in query expansion is still embryonic. In this work, we examine the usefulness of word embeddings for representing queries and documents in query-document matching tasks. For this purpose, we use a re-ranking strategy in which the re-ranking phase operates on representations of queries and documents built from word embeddings. We introduce IDF average word embeddings, a new text representation strategy based on word embeddings, which builds a query vector representation that gives higher relevance to informative terms during the process. Experimental results on TREC benchmark datasets show that our proposal consistently achieves the best results in terms of MAP.
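The IDF average word embedding described in the abstract can be sketched as follows. This is an illustrative reconstruction, not the paper's implementation: the function names, the toy corpus, and the two-dimensional embeddings are all assumptions. Each term's vector is weighted by its inverse document frequency before averaging, so rare, informative terms dominate the resulting query or document representation.

```python
import math
from collections import Counter


def idf_weights(corpus):
    """Compute IDF for every term in a corpus of tokenized documents.

    corpus: list of documents, each a list of tokens.
    Returns a dict term -> log(N / df(term)).
    """
    n_docs = len(corpus)
    df = Counter()
    for doc in corpus:
        df.update(set(doc))  # count each term once per document
    return {t: math.log(n_docs / df[t]) for t in df}


def idf_average_embedding(tokens, embeddings, idf, dim):
    """IDF-weighted average of word vectors.

    High-IDF (informative) terms contribute more to the text vector;
    out-of-vocabulary tokens are skipped.
    """
    vec = [0.0] * dim
    total_weight = 0.0
    for t in tokens:
        if t in embeddings:
            w = idf.get(t, 0.0)
            total_weight += w
            for i, x in enumerate(embeddings[t]):
                vec[i] += w * x
    if total_weight > 0:
        vec = [x / total_weight for x in vec]
    return vec


def cosine(a, b):
    """Cosine similarity used to re-rank candidate documents."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0
```

In a re-ranking setting, an initial retrieval model returns candidate documents, and each candidate is then re-scored by the cosine similarity between its IDF-averaged vector and the IDF-averaged query vector.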

Keywords:
Query expansion; Word embeddings; Information retrieval; Ranking; Natural language processing; Representation learning

Metrics

Cited By: 10
FWCI (Field Weighted Citation Impact): 1.03
Refs: 6
Citation Normalized Percentile: 0.81
Topics

Topic Modeling (Physical Sciences → Computer Science → Artificial Intelligence)
Data Quality and Management (Social Sciences → Decision Sciences → Management Science and Operations Research)
Natural Language Processing Techniques (Physical Sciences → Computer Science → Artificial Intelligence)