JOURNAL ARTICLE

Querying XML documents with multi-dimensional markup

Abstract

XML documents annotated by different NLP tools accommodate multi-dimensional markup in a single hierarchy. To query such documents, one has to account for the different ways in which the annotations and the original markup of a document can nest. We propose an expressive pattern language with an extended semantics of the sequence pattern and support for negation, permutation and regular patterns; it is especially suited to querying annotated XML documents with multi-dimensional markup. The concept of fuzzy matching allows sequences that contain textual fragments and known XML elements to be matched independently of how concurrent annotations and original markup are merged. We extend the usual notion of a sequence as a sequence of siblings by allowing sequence elements to be matched on different levels of nesting, thereby abstracting from the hierarchy of the XML document. In combination with the other language patterns, the extended sequence semantics allows more powerful and expressive queries than queries based on regular patterns.
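
The idea of matching a sequence independently of nesting can be illustrated with a minimal Python sketch. It is not the pattern language proposed in the article, whose syntax is not given here. It assumes one simple reading of fuzzy matching: every element is reduced to its character span in the document text, and a sequence matches when its items follow one another in the text, whatever level they sit on. The names `spans`, `match_from` and `find_sequence`, the `("elem", tag)` / `("text", literal)` pattern encoding, and the sample documents are all hypothetical.

```python
import xml.etree.ElementTree as ET


def spans(root):
    """Return (text, [(tag, start, end), ...]) using character offsets.

    Assumption of this sketch: an element is identified with the span of
    text it covers, so its depth in the tree no longer matters.
    """
    parts, out = [], []
    pos = 0

    def walk(node):
        nonlocal pos
        start = pos
        if node.text:
            parts.append(node.text)
            pos += len(node.text)
        for child in node:
            walk(child)
            if child.tail:
                parts.append(child.tail)
                pos += len(child.tail)
        out.append((node.tag, start, pos))

    walk(root)
    return "".join(parts), out


def skip_ws(text, i):
    """Advance i past any whitespace."""
    while i < len(text) and text[i].isspace():
        i += 1
    return i


def match_from(text, elems, pattern, i):
    """Return the end offset if pattern matches at offset i, else None."""
    if not pattern:
        return i
    i = skip_ws(text, i)
    kind, value = pattern[0]
    if kind == "text":
        if text.startswith(value, i):
            return match_from(text, elems, pattern[1:], i + len(value))
        return None
    # kind == "elem": try every element with that tag whose span starts here
    for tag, start, end in elems:
        if tag == value and skip_ws(text, start) == i and end >= i:
            result = match_from(text, elems, pattern[1:], end)
            if result is not None:
                return result
    return None


def find_sequence(root, pattern):
    """Return the text of every match of the sequence pattern."""
    text, elems = spans(root)
    hits = []
    for i in range(len(text)):
        if text[i].isspace():
            continue
        end = match_from(text, elems, pattern, i)
        if end is not None:
            hits.append(text[i:end])
    return hits


# The same annotations, merged with the original markup in two ways.
doc_a = ET.fromstring(
    "<doc><s><person>Ann</person> works at <org>Acme</org>.</s></doc>")
doc_b = ET.fromstring(
    "<doc><s><b><person>Ann</person> works</b> at "
    "<i><org>Acme</org></i>.</s></doc>")

pattern = [("elem", "person"), ("text", "works at"), ("elem", "org")]
print(find_sequence(doc_a, pattern))
print(find_sequence(doc_b, pattern))
```

In `doc_b` the `person` and `org` elements are no longer siblings and the literal "works at" crosses the boundary of the `b` element, yet both documents yield the same match. A sibling-based sequence would match only `doc_a`.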

Keywords:
Computer science; Markup language; SGML; Document type definition; XML; Document Structure Description; RuleML; Information retrieval; Sequence (biology); Programming language; Natural language processing; World Wide Web

Metrics

Cited By: 2
FWCI (Field Weighted Citation Impact): 0.42
Refs: 17
Citation Normalized Percentile: 0.60

Topics

Advanced Database Systems and Queries
Physical Sciences →  Computer Science →  Computer Networks and Communications
Semantic Web and Ontologies
Physical Sciences →  Computer Science →  Artificial Intelligence
Data Management and Algorithms
Physical Sciences →  Computer Science →  Signal Processing