JOURNAL ARTICLE

Formal Constraints for Structured Document Retrieval

Abstract

The formalization of retrieval constraints for traditional (atomic) retrieval was a major milestone in information retrieval (IR) research. The aim of these constraints was to formalize IR heuristics which most retrieval models rely upon. In a similar fashion, this paper introduces constraints for structured document retrieval (SDR). Out of the many possible constraints, we focus on three that are shown to produce intuitive rankings in simple, but informative retrieval scenarios. It is shown that none of the widely used SDR models (BM25F, MLM, linear score aggregation) satisfy all three constraints. The underlying reason for this is shown to be the failure of existing models to balance between assuming independence of term occurrences across fields and considering the documents as atomic, rather than structured. The constraints introduced in this paper, together with the analysis of how they are satisfied by existing models, can be used to analytically reason about the behaviour of any SDR model in a variety of ranking scenarios.

Keywords:
Computer science, Information retrieval

Metrics

Cited By: 1
FWCI (Field Weighted Citation Impact): 0.20
Refs: 8
Citation Normalized Percentile: 0.53

Topics

Semantic Web and Ontologies
Physical Sciences → Computer Science → Artificial Intelligence
Information Retrieval and Search Behavior
Physical Sciences → Computer Science → Information Systems
Web Data Mining and Analysis
Physical Sciences → Computer Science → Information Systems

Related Documents

BOOK-CHAPTER

Structured Document Retrieval

Mounia Lalmas, Ricardo Baeza‐Yates

Encyclopedia of Database Systems Year: 2016 Pages: 1-2
BOOK-CHAPTER

Structured Document Retrieval

Mounia Lalmas, Ricardo Baeza‐Yates

Encyclopedia of Database Systems Year: 2009 Pages: 2867-2868
BOOK-CHAPTER

Structured Document Retrieval

Mounia Lalmas, Ricardo Baeza‐Yates

Encyclopedia of Database Systems Year: 2018 Pages: 3827-3829
BOOK-CHAPTER

Focussed Structured Document Retrieval

Gabriella Kazai, Mounia Lalmas, Thomas Roelleke

Lecture Notes in Computer Science Year: 2002 Pages: 241-247
DISSERTATION

Precise document retrieval in structured domains

Tam, Wai Lap Vincent

University: UNSWorks (University of New South Wales, Sydney, Australia) Year: 2009