Fine Grained Spoken Document Summarization Through Text Segmentation

Samantha Kotey; Rozenn Dahyot; Naomi Harte

doi:10.1109/slt54892.2023.10022829

Year: 2023
Journal: 2022 IEEE Spoken Language Technology Workshop (SLT)
Vol: 33
Pages: 647-654

Abstract

Podcast transcripts are long spoken documents of conversational dialogue. Podcasts are challenging to summarize: they cover a diverse range of topics, vary in length, and differ widely in linguistic style. Previous studies in podcast summarization have generated short, concise dialogue summaries. In contrast, we propose a method to generate long, fine-grained summaries, which describe details of sub-topic narratives. Leveraging a readability formula, we curate a data subset to train a long-sequence transformer for abstractive summarization. Through text segmentation, we filter the evaluation data and exclude specific segments of text. We apply the model to segmented data, producing different types of fine-grained summaries. We show that appropriate filtering yields comparable ROUGE results and serves as an alternative to truncation. Experiments show our model outperforms previous studies on the Spotify podcast dataset when tasked with generating longer sequences of text.
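As a rough illustration of the contrast the abstract draws between truncation and segment filtering, the sketch below shortens a toy transcript both ways and scores each result with a simplified ROUGE-1 F1. Everything in it is an assumption made for illustration: the function names, the whitespace tokenizer, the `keep` predicate, and the example segments are hypothetical, and the abstract does not say which segments the authors exclude or how. The sketch also scores the shortened input directly against a reference, only to stay self-contained; in the described method a trained summarization model would sit between the input and the ROUGE evaluation.

```python
from collections import Counter
from typing import Callable, List


def tokenize(text: str) -> List[str]:
    """Lowercase whitespace tokenization (a simplification of ROUGE preprocessing)."""
    return text.lower().split()


def rouge1_f1(candidate: str, reference: str) -> float:
    """Unigram-overlap F1 between a candidate text and a reference summary."""
    cand = Counter(tokenize(candidate))
    ref = Counter(tokenize(reference))
    overlap = sum((cand & ref).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def truncate(segments: List[str], budget: int) -> str:
    """Baseline: join all segments and keep only the first `budget` tokens."""
    tokens = tokenize(" ".join(segments))
    return " ".join(tokens[:budget])


def filter_segments(segments: List[str], keep: Callable[[str], bool]) -> str:
    """Alternative: drop whole segments that fail a caller-supplied test.

    `keep` is a hypothetical stand-in for whatever segment-level
    criterion is actually used; it is not taken from the paper.
    """
    return " ".join(s for s in segments if keep(s))


# Toy transcript, already split into segments (assumed, not real data).
segments = [
    "this episode is sponsored by acme",
    "today we discuss deep sea diving",
    "our guest explains decompression safety",
]
reference = "the hosts discuss deep sea diving and decompression safety"

truncated = truncate(segments, budget=8)
filtered = filter_segments(segments, keep=lambda s: "sponsored" not in s)

score_truncated = rouge1_f1(truncated, reference)
score_filtered = rouge1_f1(filtered, reference)
print(f"truncated: {score_truncated:.2f}  filtered: {score_filtered:.2f}")
```

In this toy case truncation spends its token budget on the opening segment, while filtering removes that segment and keeps the topical ones. Real transcripts will not separate this cleanly, and the abstract claims only comparable ROUGE results for filtering, not a consistent gain.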

Keywords:

Automatic summarization; Computer science; Readability; Natural language processing; Transformer; Artificial intelligence; Segmentation; Plain text; Information retrieval; Filter (signal processing)

Topics

Topic Modeling

Physical Sciences → Computer Science → Artificial Intelligence

Natural Language Processing Techniques

Physical Sciences → Computer Science → Artificial Intelligence

Text Readability and Simplification

Physical Sciences → Computer Science → Artificial Intelligence
