JOURNAL ARTICLE

Topic-Aware Modeling for Unsupervised Extractive Summarization

Abstract

The recent success of extractive summarization depends on the availability of large-scale annotated datasets. Existing unsupervised approaches are mostly based on directed graphs and combine position information with centrality computation. These methods tend to generate summaries with two problems: low topic coverage of the source document, called the facet bias problem, and extracted sentences that are concentrated in contiguous positions, called the position bias problem. To solve these problems, we propose the topic-aware centrality-based summarization method (TACSUM). Specifically, we employ clustering techniques to explicitly model the topics of the document, and we define metrics for topic consistency and topic coverage to improve summarization performance. The topic consistency metric guides the calculation of centrality, which addresses the position bias problem and generalizes better across different scenarios. We combine the topic coverage metric with centrality to enhance the topic awareness of the model, which ensures that the selected sentences are both important and diverse. Experimental results on four datasets show that our method outperforms previous unsupervised methods, especially in long-document domains. Extensive analyses confirm that our method generates high-quality summaries by eliminating the position bias and facet bias problems.
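The pipeline described in the abstract (cluster sentences into topics, let topic consistency weight the centrality computation, then favor topic coverage during selection) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the bag-of-words `embed` stand-in for a sentence encoder, the toy k-means in `cluster`, the cross-topic weight `lam`, the coverage bonus `beta`, and all function names. The abstract does not give TACSUM's actual formulas, so this should be read as one possible interpretation rather than the authors' method.

```python
import math
import re
from collections import Counter


def embed(sentence):
    # Bag-of-words counts; an assumed stand-in for a real sentence encoder.
    return Counter(re.findall(r"[a-z]+", sentence.lower()))


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def cluster(vectors, k, iters=10):
    # Toy k-means on cosine similarity; farthest-point seeding keeps it deterministic.
    centroids = [vectors[0]]
    while len(centroids) < k:
        far = min(range(len(vectors)), key=lambda i: max(cosine(vectors[i], c) for c in centroids))
        centroids.append(vectors[far])
    labels = []
    for _ in range(iters):
        labels = [max(range(k), key=lambda c: cosine(v, centroids[c])) for v in vectors]
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = sum(members, Counter())
    return labels


def centrality(vectors, labels, lam):
    # Assumed "topic consistency" weighting: same-topic edges count fully,
    # cross-topic edges are down-weighted by lam. Scores are scaled to [0, 1].
    n = len(vectors)
    scores = []
    for i in range(n):
        total = 0.0
        for j in range(n):
            if i != j:
                weight = 1.0 if labels[i] == labels[j] else lam
                total += weight * cosine(vectors[i], vectors[j])
        scores.append(total)
    top = max(scores) or 1.0
    return [s / top for s in scores]


def summarize(sentences, k=2, n=2, lam=0.5, beta=1.0):
    # Greedy selection: centrality plus an assumed bonus beta for a topic not yet covered.
    vectors = [embed(s) for s in sentences]
    labels = cluster(vectors, k)
    scores = centrality(vectors, labels, lam)
    picked, covered = [], set()
    while len(picked) < min(n, len(sentences)):
        rest = [i for i in range(len(sentences)) if i not in picked]
        best = max(rest, key=lambda i: scores[i] + (beta if labels[i] not in covered else 0.0))
        picked.append(best)
        covered.add(labels[best])
    return sorted(picked), labels


if __name__ == "__main__":
    doc = [
        "cats chase mice",
        "cats chase birds",
        "cats chase mice and birds",
        "stocks rise today",
        "stocks fall today",
    ]
    picked, labels = summarize(doc, k=2, n=2)
    print([doc[i] for i in picked])
```

Setting `beta=0` in this sketch reduces selection to plain centrality ranking, which on the toy document picks both sentences from the larger topic. This is the kind of low topic coverage the abstract calls facet bias, and the coverage bonus is one simple way to counteract it.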

Keywords:
Automatic summarization, Computer science, Centrality, Metric (unit), Consistency (knowledge bases), Cluster analysis, Topic model, Data mining, Graph, Information retrieval, Position (finance), Artificial intelligence, Machine learning, Theoretical computer science, Mathematics

Metrics

Cited By: 3
FWCI (Field Weighted Citation Impact): 0.77
Refs: 36
Citation Normalized Percentile: 0.72

Topics

Topic Modeling: Physical Sciences → Computer Science → Artificial Intelligence
Natural Language Processing Techniques: Physical Sciences → Computer Science → Artificial Intelligence
Advanced Text Analysis Techniques: Physical Sciences → Computer Science → Artificial Intelligence