JOURNAL ARTICLE

Document Retrieval Using Entity-Based Language Models

Abstract

We address the ad hoc document retrieval task by devising novel types of entity-based language models. The models utilize information about single terms in the query and documents as well as term sequences marked as entities by some entity-linking tool. The key principle of the language models is accounting, simultaneously, for the uncertainty inherent in the entity-markup process and the balance between using entity-based and term-based information. Empirical evaluation demonstrates the merits of using the language models for retrieval. For example, the performance transcends that of a state-of-the-art term proximity method. We also show that the language models can be effectively used for cluster-based document retrieval and query expansion.
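The abstract's key principle can be illustrated with a minimal sketch. It is not the paper's formulation: the abstract gives no equations, so everything below is an assumption made for illustration. Terms and entity mentions are treated as tokens of one language model, each entity mention contributes a fractional count equal to its entity-linking confidence, a hypothetical weight `lam` balances entity-based against term-based evidence, and documents are ranked by a standard Dirichlet-smoothed query likelihood. The names `pseudo_counts`, `dirichlet_log_likelihood`, `lam`, and `mu`, and the toy data, are all invented here.

```python
import math
from collections import Counter


def pseudo_counts(terms, entity_mentions, lam):
    """Build soft token counts for one text (query or document).

    terms: list of term strings.
    entity_mentions: list of (entity_id, confidence) pairs, confidence in [0, 1].
    lam: hypothetical weight in [0, 1] placed on entity-based evidence.
    """
    counts = Counter()
    for term in terms:
        counts[("term", term)] += 1.0 - lam
    for entity_id, confidence in entity_mentions:
        # An uncertain entity link contributes only a fraction of a count.
        counts[("entity", entity_id)] += lam * confidence
    return counts


def dirichlet_log_likelihood(query_counts, doc_counts, collection_counts, mu=1000.0):
    """Weighted query log-likelihood under a Dirichlet-smoothed document model."""
    doc_len = sum(doc_counts.values())
    coll_len = sum(collection_counts.values())
    if coll_len == 0:
        return 0.0
    score = 0.0
    for token, weight in query_counts.items():
        p_coll = collection_counts.get(token, 0.0) / coll_len
        if p_coll == 0.0:
            continue  # token never seen in the collection: skipped in this sketch
        p_doc = (doc_counts.get(token, 0.0) + mu * p_coll) / (doc_len + mu)
        score += weight * math.log(p_doc)
    return score


# Toy example: two documents mention "obama" but link to different entities.
doc_a = pseudo_counts(["barack", "obama", "speech"], [("Barack_Obama", 0.9)], lam=0.5)
doc_b = pseudo_counts(["obama", "city", "japan"], [("Obama,_Fukui", 0.8)], lam=0.5)
collection = doc_a + doc_b
query = pseudo_counts(["obama"], [("Barack_Obama", 0.7)], lam=0.5)

score_a = dirichlet_log_likelihood(query, doc_a, collection, mu=10.0)
score_b = dirichlet_log_likelihood(query, doc_b, collection, mu=10.0)
```

In this toy setup the term "obama" alone cannot separate the two documents, while the confidence-weighted entity token favors the document linked to the same entity as the query. Setting `lam` to 0 reduces the sketch to a purely term-based model.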

Keywords:
Computer science; Information retrieval; Markup language; Term (time); Natural language processing; Query language; Document retrieval; Key (lock); Task (project management); Query expansion; Language model; Artificial intelligence; Document type definition; Document Structure Description; XML; World Wide Web

Metrics

Cited By: 90
FWCI (Field Weighted Citation Impact): 14.66
Refs: 57
Citation Normalized Percentile: 1.00
Is in top 1%
Is in top 10%

Topics

Topic Modeling
Physical Sciences →  Computer Science →  Artificial Intelligence
Information Retrieval and Search Behavior
Physical Sciences →  Computer Science →  Information Systems
Semantic Web and Ontologies
Physical Sciences →  Computer Science →  Artificial Intelligence

Related Documents

BOOK-CHAPTER

Utilizing Passage-Based Language Models for Document Retrieval

Michael Bendersky, Oren Kurland

Lecture Notes in Computer Science Year: 2008 Pages: 162-174

JOURNAL ARTICLE

Utilizing passage-based language models for document retrieval

Michael Bendersky, Oren Kurland

Journal: European Conference on Information Retrieval Year: 2008 Pages: 162-174

JOURNAL ARTICLE

Content-based Language Models for Spoken Document Retrieval

Hsin‐Min Wang, Berlin Chen

Journal: International Journal of Computer Processing Of Languages Year: 2001 Vol: 14 (02) Pages: 193-209

JOURNAL ARTICLE

Biterm language models for document retrieval

Munirathnam Srikanth, Rohini K. Srihari

Journal: Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '02 Year: 2002