JOURNAL ARTICLE

Diversity-Aware Top-k Publish/Subscribe for Text Stream

Abstract

Massive amounts of text data are being generated by a huge number of web users at an unprecedented scale. These data cover a wide range of topics. Users are interested in receiving a few up-to-date, representative documents (e.g., tweets) that provide wide coverage of the different aspects of their query topics. To address this problem, we consider the Diversity-Aware Top-k Subscription (DAS) query. Given a DAS query, we continuously maintain an up-to-date result set that contains the k most recently returned documents over a text stream for the query. The DAS query takes into account text relevance, document recency, and result diversity. We propose a novel solution for efficiently processing a large number of DAS queries over a stream of documents. We demonstrate the efficiency of our approach on a real-world dataset; the experimental results show that our solution reduces processing time by 60-75% compared with two baselines. We also study the effectiveness of the DAS query.

Keywords:
Computer science; Information retrieval; Web query classification; Set (abstract data type); Web search query; Relevance (law); Publication; Query expansion; Cover (algebra); World Wide Web; Search engine

Metrics

Cited By: 48
FWCI (Field Weighted Citation Impact): 7.67
Refs: 40
Citation Normalized Percentile: 0.98

Topics

Caching and Content Delivery
Physical Sciences →  Computer Science →  Computer Networks and Communications
Recommender Systems and Techniques
Physical Sciences →  Computer Science →  Information Systems
Peer-to-Peer Network Technologies
Physical Sciences →  Computer Science →  Computer Networks and Communications