DISSERTATION

Neural text summarization with fine-grained control

Xu, Jiacheng

Year: 2022 · University: Texas Digital Library (University of Texas) · Publisher: The University of Texas at Austin

Abstract

Recently, neural network-based approaches have pushed the performance of both extractive and abstractive text summarization models to new heights. Despite these advances, the black-box nature of neural summarization models makes them hard to interpret and control: we cannot force a model to include or exclude certain pieces of information, nor can we guarantee that everything it includes is factual. Understanding the model's mechanism and designing methods to control its behavior are important pieces missing from the current paradigm. In this dissertation, we aim to build summarization systems with fine-grained control. Fine-grained control allows us to understand, assess, and manipulate small pieces of the output of summarization models. By structuring or analyzing the generation of summaries in terms of these small pieces, we can explore paraphrase possibilities or precisely correct factual errors. More specifically, we want (a) fine granularity of spans that can be selected or removed in extractive and abstractive systems; (b) a deep understanding of how the model works, so that the generation process becomes more transparent and interpretable and we can guide the model better; and (c) a more flexible and powerful decoding scheme that improves the diversity and quality of generated texts. For extractive systems, we build compressive summarization models that remove undesired spans from the sentences selected by the extractive model. For abstractive systems, we start with a descriptive analysis of the model's generation by measuring and comparing the entropies of different generation steps. Then, we propose a two-stage framework to fully interpret the step-wise prediction decisions of neural abstractive summarization models. We conduct a comprehensive evaluation of commonly used attribution methods to assess their ability to locate and attribute content from the input. Finally, we present a search algorithm that constructs lattices encoding a massive number of generation options for text summarization and machine translation. Two key components, a modified best-first search and hypothesis recombination, are developed to fulfill this goal. Our approach, with the introduced lattice structure, encodes more high-quality candidates than baseline methods, with significantly higher overlap with annotated reference generations. With these models, tools, and algorithms, we systematically gain more control over smaller segments and interpretable units of the generation process. This fine-grained analysis and construction of text generation systems can enable developers and users to select from, filter, calibrate, and examine models' output, empowering a range of possible applications.
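As a concrete illustration of the entropy analysis mentioned in the abstract, the following is a minimal sketch that measures the entropy of a summarizer's next-token distribution at each decoding step. It assumes a Hugging Face seq2seq model; the model name, the sample article, and greedy decoding are illustrative choices, not details taken from the dissertation.

```python
# Minimal sketch: per-step entropy of a neural summarizer's output
# distribution. Model choice and decoding settings are assumptions,
# not the dissertation's actual setup.
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "facebook/bart-large-cnn"  # illustrative summarization model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

article = ("The city council approved the new budget on Tuesday after "
           "weeks of debate over school funding and transit subsidies.")
inputs = tokenizer(article, return_tensors="pt", truncation=True)

# Greedy decoding, keeping the logits of every generation step.
out = model.generate(**inputs, num_beams=1, do_sample=False,
                     max_new_tokens=40, output_scores=True,
                     return_dict_in_generate=True)

# Entropy at step t: H_t = -sum_v p(v | y_<t, x) * log p(v | y_<t, x).
for t, step_logits in enumerate(out.scores):
    probs = torch.softmax(step_logits[0], dim=-1)
    h = -(probs * probs.clamp_min(1e-12).log()).sum().item()
    token = tokenizer.decode(out.sequences[0, t + 1])  # position 0 is the decoder start token
    print(f"step {t:2d}  token={token!r:>12}  entropy={h:.2f} nats")
```

Steps that copy from the source tend to show low entropy while steps that begin new content show higher entropy, which is the kind of pattern such a descriptive analysis can surface.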
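The lattice construction can likewise be sketched in outline: candidates are expanded best-first by model score, and two partial hypotheses that end in the same n-gram are merged into a single lattice node. Everything below is a hypothetical toy (the function names, the suffix-based recombination criterion, and the scoring interface are all assumptions), not the dissertation's algorithm.

```python
import heapq
from collections import defaultdict

def best_first_lattice(expand, score, start, is_final, n=2, max_pops=500):
    """Toy best-first search with hypothesis recombination.

    expand(tokens) yields candidate next tokens; score(tokens) returns a
    model score (higher is better). Hypotheses sharing the same last-n
    tokens are treated as one lattice node, so many generation paths are
    stored compactly as edges rather than as separate hypotheses.
    """
    tie = 0  # tie-breaker so the heap never compares token lists
    frontier = [(-score(start), tie, start)]
    edges = defaultdict(set)   # lattice: suffix node -> predecessor nodes
    expanded = set()           # suffix nodes already expanded once
    finished = []
    for _ in range(max_pops):
        if not frontier:
            break
        neg_score, _, tokens = heapq.heappop(frontier)
        node = tuple(tokens[-n:])
        if node in expanded:
            # Recombination: an equivalent hypothesis was already expanded,
            # so its continuations are shared through the lattice edges.
            continue
        expanded.add(node)
        if is_final(tokens):
            finished.append((tokens, -neg_score))
            continue
        for tok in expand(tokens):
            child = tokens + [tok]
            edges[tuple(child[-n:])].add(node)
            tie += 1
            heapq.heappush(frontier, (-score(child), tie, child))
    return finished, edges
```

Recombination is what keeps the frontier small while the lattice encodes a combinatorially large set of paths; the abstract's claims about candidate diversity and reference overlap concern the paths such a lattice makes reachable.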

Keywords:
Automatic summarization, Paraphrase, Granularity, Artificial neural network, Process (computing), Control (management), Natural language generation

Metrics

Cited By: 0
FWCI (Field-Weighted Citation Impact): 0.00
References: 0

Topics

Topic Modeling (Physical Sciences → Computer Science → Artificial Intelligence)
Text and Document Classification Technologies (Physical Sciences → Computer Science → Artificial Intelligence)
Natural Language Processing Techniques (Physical Sciences → Computer Science → Artificial Intelligence)

Related Documents

DISSERTATION

Fine-grained evaluation for text summarization

Goyal, Tanya (Ph.D. in Computer Science)

University: Texas Digital Library (University of Texas) · Year: 2023
JOURNAL ARTICLE

Fine Grained Spoken Document Summarization Through Text Segmentation

Samantha Kotey, Rozenn Dahyot, Naomi Harte

Journal: 2022 IEEE Spoken Language Technology Workshop (SLT) · Year: 2023 · Vol: 33 · Pages: 647-654
BOOK-CHAPTER

Abstractive Text Summarization with Fine-Tuned Transformer

Venkata Phanidra Kumar Siginamsetty, Ashu Abdul

Series: Lecture Notes in Electrical Engineering · Year: 2023 · Pages: 587-596