JOURNAL ARTICLE

Data Augmentation for Abstractive Query-Focused Multi-Document Summarization

Ramakanth Pasunuru, Aslı Çelikyılmaz, Michel Galley, Chenyan Xiong, Yizhe Zhang, Mohit Bansal, Jianfeng Gao

Year: 2021 Journal: Proceedings of the AAAI Conference on Artificial Intelligence Vol: 35 (15) Pages: 13666-13674 Publisher: Association for the Advancement of Artificial Intelligence

Abstract

The progress in Query-focused Multi-Document Summarization (QMDS) has been limited by the lack of sufficient large-scale high-quality training datasets. We present two QMDS training datasets, which we construct using two data augmentation methods: (1) transferring the commonly used single-document CNN/Daily Mail summarization dataset to create the QMDSCNN dataset, and (2) mining search-query logs to create the QMDSIR dataset. These two datasets have complementary properties, i.e., QMDSCNN has real summaries but queries are simulated, while QMDSIR has real queries but simulated summaries. To cover both these real summary and query aspects, we build abstractive end-to-end neural network models on the combined datasets that yield new state-of-the-art transfer results on DUC datasets. We also introduce new hierarchical encoders that enable a more efficient encoding of the query together with multiple documents. Empirical results demonstrate that our data augmentation and encoding methods outperform baseline models on automatic metrics, as well as on human evaluations along multiple attributes.

Keywords:
Computer science, Automatic summarization, Information retrieval, Data mining, Encoder, Multi-document summarization, Encoding (memory), Construct (python library), Baseline (sea), Artificial intelligence

Metrics

Cited By: 33
FWCI (Field Weighted Citation Impact): 3.68
Refs: 61
Citation Normalized Percentile: 0.95

Topics

Topic Modeling
Physical Sciences → Computer Science → Artificial Intelligence
Advanced Text Analysis Techniques
Physical Sciences → Computer Science → Artificial Intelligence
Data Quality and Management
Social Sciences → Decision Sciences → Management Science and Operations Research

Related Documents

BOOK-CHAPTER

Selection Driven Query Focused Abstractive Document Summarization

Chudamani Aryal, Yllias Chali

Lecture Notes in Computer Science Year: 2020 Pages: 118-124

BOOK-CHAPTER

Query-Focused Multi-document Summarization

Jianfeng Gao, Chenyan Xiong, Paul N. Bennett, Nick Craswell

The Information Retrieval Series Year: 2023 Pages: 71-88

JOURNAL ARTICLE

Query-Focused Multi-document Summarization Survey

Entesar Alanzi, Safa Alballaa

Journal: International Journal of Advanced Computer Science and Applications Year: 2023 Vol: 14 (6)