Recently, neural network-based approaches have pushed the performance of both extractive and abstractive text summarization models to new heights. Despite these advances, the black-box nature of these models makes them hard to interpret and control. We cannot force a model to include or exclude certain pieces of information, nor can we guarantee that everything it includes is factual. Understanding the model's mechanism and designing methods to control its behavior are important pieces missing from the current paradigm. In this dissertation, we aim to build summarization systems with fine-grained control. Fine-grained control allows us to understand, assess, and manipulate small pieces of the output of summarization models. By structuring or analyzing the generation of summaries in terms of these small pieces, we can explore paraphrase possibilities or precisely correct factual errors. More specifically, we want (a) fine-grained spans that can be selected or removed in extractive and abstractive systems; (b) a deep understanding of how the model works, so that we can make the generation process more transparent and interpretable and guide the model better; and (c) a more flexible and powerful decoding scheme that improves the diversity and quality of generated text. For extractive systems, we build compressive summarization models that remove undesired spans from the sentences selected by the extractive model. For abstractive systems, we start with a descriptive analysis of the model's generation by measuring and comparing the entropies of different generation steps. Then, we propose a two-stage framework to interpret the step-wise prediction decisions of neural abstractive summarization models. We conduct a comprehensive evaluation of commonly used attribution methods to assess their ability to locate and attribute content from the input.
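The entropy measurement mentioned above can be illustrated with a minimal sketch. It assumes each generation step is available as a plain list of next-token probabilities; the function names and the toy distributions are hypothetical and are not taken from the dissertation.

```python
import math

def step_entropy(probs):
    """Shannon entropy (in nats) of one next-token distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def entropy_profile(step_distributions):
    """Entropy of every generation step, in decoding order."""
    return [step_entropy(p) for p in step_distributions]

# Toy, made-up distributions over a 4-token vocabulary (assumption).
steps = [
    [0.97, 0.01, 0.01, 0.01],  # peaked: the model is nearly certain
    [0.25, 0.25, 0.25, 0.25],  # uniform: the model is maximally uncertain
]
profile = entropy_profile(steps)
```

Comparing such per-step values is one way to see where a model is confident and where it is choosing among many plausible continuations.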
Finally, we present a search algorithm that constructs lattices encoding a massive number of generation options for text summarization and machine translation. Two key components, modified best-first search and hypothesis recombination, are developed to this end. With the introduced lattice structure, our approach encodes more high-quality candidates than baseline methods, with significantly higher overlap with annotated reference generations. With these models, tools, and algorithms, we systematically gain more control over smaller segments and interpretable units of the generation process. This fine-grained analysis and construction of text generation systems can enable developers and users to select from, filter, calibrate, and examine models' output, empowering a range of possible applications.
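As a rough illustration of how best-first search and hypothesis recombination can together yield a compact lattice, consider the sketch below. It is a simplified toy, not the dissertation's algorithm: the scoring rule (cumulative log-probability), the merge rule (hypotheses sharing the same last `suffix_len` tokens collapse into one node), and the `toy_next_tokens` model are all assumptions made for the example.

```python
import heapq
import math

def toy_next_tokens(prefix):
    """Hypothetical stand-in for a model: (token, log-prob) continuations."""
    table = {
        "<s>": [("the", math.log(0.6)), ("a", math.log(0.4))],
        "the": [("cat", math.log(0.5)), ("dog", math.log(0.5))],
        "a": [("cat", math.log(0.5)), ("dog", math.log(0.5))],
        "cat": [("sat", math.log(1.0))],
        "dog": [("sat", math.log(1.0))],
        "sat": [("</s>", math.log(1.0))],
    }
    return table.get(prefix[-1], [])

def build_lattice(next_tokens, suffix_len=1, budget=50):
    """Expand the best-scoring hypothesis first; merge on a shared suffix."""
    start = ("<s>",)
    edges = {}                  # node -> set of successor nodes
    frontier = [(0.0, start)]   # (negative cumulative log-prob, prefix)
    seen = {start[-suffix_len:]}
    while frontier and budget > 0:
        neg_score, prefix = heapq.heappop(frontier)
        budget -= 1
        node = prefix[-suffix_len:]
        for token, logp in next_tokens(prefix):
            new_prefix = prefix + (token,)
            new_node = new_prefix[-suffix_len:]
            edges.setdefault(node, set()).add(new_node)
            if new_node in seen:
                continue  # recombination: link to the existing node only
            seen.add(new_node)
            heapq.heappush(frontier, (neg_score - logp, new_prefix))
    return edges

def count_paths(edges, node=("<s>",), end=("</s>",)):
    """Count complete paths; assumes the lattice is acyclic (true for the toy)."""
    if node == end:
        return 1
    return sum(count_paths(edges, nxt, end) for nxt in edges.get(node, ()))

edges = build_lattice(toy_next_tokens)
n_paths = count_paths(edges)  # 4 distinct sentences in the toy lattice
```

In this toy run, seven nodes are enough to encode four complete sentences, because the hypotheses ending in "cat" and "dog" are each expanded only once and shared by both preceding words.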
Goyal, Tanya (Ph.D. in Computer Science)