JOURNAL ARTICLE

Answering top-k representative queries on graph databases

Abstract

Given a function that classifies a data object as relevant or irrelevant, we consider the task of selecting k objects that best represent all relevant objects in the underlying database. This problem occurs naturally when analysts want to familiarize themselves with the relevant objects in a database using a small set of k exemplars. In this paper, we solve the problem of top-k representative queries on graph databases. While graph databases model a wide range of scientific data, solving the problem in the context of graphs presents unique challenges due to the inherent complexity of matching structures. Furthermore, top-k representative queries map to the classic Set Cover problem, making them NP-hard. To overcome these challenges, we develop a greedy approximation with theoretical guarantees on the quality of the answer set, noting that a better approximation is not feasible in polynomial time. To reduce the quadratic computational cost of the greedy algorithm, we propose an index structure called NB-Index, which indexes the θ-neighborhoods of the database graphs by employing a novel combination of Lipschitz embedding and agglomerative clustering. Extensive experiments on real graph datasets validate the efficiency and effectiveness of the proposed techniques, which achieve up to two orders of magnitude speed-up over state-of-the-art algorithms.
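The greedy selection described in the abstract can be illustrated with a minimal sketch of the standard greedy heuristic for maximum coverage. This is not the authors' implementation. It assumes, as a simplification, that an object "represents" every relevant object within distance θ of it, and the names `greedy_representatives`, `distance`, and `theta` are hypothetical. For maximum coverage in general, this greedy rule is known to reach a 1 − 1/e fraction of the optimal coverage; whether that is the exact guarantee proved in the paper is not stated in the abstract.

```python
from typing import Callable, List, Sequence, Set, Tuple


def greedy_representatives(
    relevant: Sequence,
    distance: Callable[[object, object], float],
    theta: float,
    k: int,
) -> Tuple[List, int]:
    """Pick up to k objects whose theta-neighborhoods cover the most relevant objects.

    Hypothetical helper for illustration; `distance` stands in for any
    graph distance and is an assumption, not the paper's measure.
    """
    n = len(relevant)
    # theta-neighborhood of each object: indices of relevant objects within theta.
    # This step needs O(n^2) distance computations.
    neighborhoods: List[Set[int]] = [
        {j for j in range(n) if distance(relevant[i], relevant[j]) <= theta}
        for i in range(n)
    ]
    covered: Set[int] = set()
    chosen: List[int] = []
    for _ in range(min(k, n)):
        # Choose the unchosen candidate that covers the most uncovered objects.
        best = max(
            (i for i in range(n) if i not in chosen),
            key=lambda i: len(neighborhoods[i] - covered),
        )
        if not neighborhoods[best] - covered:
            break  # nothing new can be covered
        chosen.append(best)
        covered |= neighborhoods[best]
    return [relevant[i] for i in chosen], len(covered)


# Toy usage with numbers in place of graphs and absolute difference as distance.
reps, n_covered = greedy_representatives(
    [1, 2, 3, 10, 11, 20], lambda a, b: abs(a - b), theta=1, k=2
)
print(reps, n_covered)
```

Building the neighborhoods takes a quadratic number of distance computations, which is the cost the abstract says NB-Index is designed to reduce. With graphs, each distance call is itself expensive, so the sketch above is only practical for small inputs.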

Keywords:
Computer science, Graph database, Embedding, Cluster analysis, Theoretical computer science, Greedy algorithm, Graph, Approximation algorithm, Database, Data mining, Algorithm, Artificial intelligence

Metrics

Cited By: 33
FWCI (Field Weighted Citation Impact): 3.86
Refs: 30
Citation Normalized Percentile: 0.95

Topics

Graph Theory and Algorithms
Physical Sciences → Computer Science → Computer Vision and Pattern Recognition
Data Management and Algorithms
Physical Sciences → Computer Science → Signal Processing
Complexity and Algorithms in Graphs
Physical Sciences → Computer Science → Computational Theory and Mathematics

Related Documents

JOURNAL ARTICLE

Answering Top-k Graph Similarity Queries in Graph Databases

Yuanyuan Zhu, Lu Qin, Jeffrey Xu Yu, Hong Cheng

Journal: IEEE Transactions on Knowledge and Data Engineering, Year: 2019, Vol: 32 (8), Pages: 1459-1474

JOURNAL ARTICLE

Answering Top-k Keyword Queries on Relational Databases

Myint Thein, Mie Mie Su Thwin

Journal: International Journal of Information Retrieval Research, Year: 2012, Vol: 2 (3), Pages: 36-57

BOOK-CHAPTER

Top-k Differential Queries in Graph Databases

Elena Vasilyeva, Maik Thiele, Christof Bornhövd, Wolfgang Lehner

Lecture Notes in Computer Science, Year: 2014, Pages: 112-125

BOOK-CHAPTER

Preference-Based Top-k Representative Skyline Queries on Uncertain Databases

Hà Thanh Nguyễn, Jinli Cao

Lecture Notes in Computer Science, Year: 2015, Pages: 280-292