JOURNAL ARTICLE

Measuring importance and query relevance in topic-focused multi-document summarization

Abstract

The increasing complexity of summarization systems makes it difficult to analyze exactly which modules make a difference in performance. We carried out a principled comparison between the two most commonly used schemes for assigning importance to words in the context of query-focused multi-document summarization: raw frequency (word probability) and log-likelihood ratio (LLR). We demonstrate that the advantages of LLR come from its known distributional properties, which allow for the identification of a set of words that in its entirety defines the aboutness of the input. We also find that LLR is more suitable for query-focused summarization since, unlike raw frequency, it is more sensitive to the integration of the information need defined by the user.
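
The abstract names the two weighting schemes but gives no formulas. As a rough illustration only, the sketch below computes word probability as relative frequency in the input, and a log-likelihood ratio in the binomial form commonly used for topic signatures, comparing the input against a background corpus. The function names, the use of a background corpus, and the cutoff of 10.83 (the chi-square critical value for p = 0.001 with one degree of freedom) are assumptions made for this example and are not taken from the article.

```python
import math
from collections import Counter


def word_probabilities(input_tokens):
    """Raw-frequency weighting: p(w) = count(w) / N over the input."""
    counts = Counter(input_tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def _log_l(k, n, p):
    """Binomial log-likelihood of k hits in n trials, taking 0 * log(0) as 0."""
    ll = 0.0
    if k > 0:
        ll += k * math.log(p)
    if n - k > 0:
        ll += (n - k) * math.log(1.0 - p)
    return ll


def llr(k1, n1, k2, n2):
    """-2 log(lambda) for a word seen k1 times in n1 input tokens
    and k2 times in n2 background tokens (both corpora non-empty)."""
    p = (k1 + k2) / (n1 + n2)   # shared rate under the null hypothesis
    p1, p2 = k1 / n1, k2 / n2   # separate rates under the alternative
    return 2.0 * (_log_l(k1, n1, p1) + _log_l(k2, n2, p2)
                  - _log_l(k1, n1, p) - _log_l(k2, n2, p))


def topic_signature(input_tokens, background_tokens, threshold=10.83):
    """Words that are significantly more frequent in the input than in
    the background. The threshold is an assumed cutoff, not the article's."""
    inp, bg = Counter(input_tokens), Counter(background_tokens)
    n1, n2 = sum(inp.values()), sum(bg.values())
    signature = set()
    for w, k1 in inp.items():
        k2 = bg[w]
        if k1 / n1 > k2 / n2 and llr(k1, n1, k2, n2) > threshold:
            signature.add(w)
    return signature


# Toy data (hypothetical): "quake" is topical, "the" is equally common everywhere.
input_tokens = ["quake"] * 20 + ["the"] * 20
background_tokens = ["the"] * 500 + ["a"] * 500
signature = topic_signature(input_tokens, background_tokens)
```

In this toy input both words receive the same word probability (0.5), while only "quake" passes the LLR test, which illustrates how a significance cutoff yields a set of topic words rather than a ranking by frequency.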

Keywords:
Automatic summarization; Computer science; Relevance (law); Information retrieval; Multi-document summarization; Set (abstract data type); Context (archaeology); Identification (biology); Word (group theory); Mathematics

Metrics

Cited By: 51
FWCI (Field Weighted Citation Impact): 4.27
Refs: 9
Citation Normalized Percentile: 0.95

Topics

Topic Modeling: Physical Sciences → Computer Science → Artificial Intelligence
Natural Language Processing Techniques: Physical Sciences → Computer Science → Artificial Intelligence
Algorithms and Data Compression: Physical Sciences → Computer Science → Artificial Intelligence
