JOURNAL ARTICLE

Tiered sentence based topic model for multi-document summarization

Nadeem Akhtar, M. M. Sufyan Beg, Hira Javed, Md. Muzakkir Hussain

Year: 2022   Journal: Journal of Information and Optimization Sciences   Vol: 43 (8)   Pages: 2131-2141   Publisher: Taylor & Francis

Abstract

This work proposes a probabilistic two-level topic model, the Tiered Sentence based Topic Model, which models a document at the sentence and word levels and infers hierarchical latent topics for sentences. The model uses two latent variables for the generation of words, a super topic and a subtopic for each sentence of the document, to model word groupings at the sentence level. Popular super topics identify the general theme of the documents and are used for selecting summary sentences. The model parameters are used to rank sentences by sentence importance and topic coverage. Collapsed Gibbs sampling is used for inference and parameter estimation. The proposed model is compared with two sentence-based topic models, SenLDA and LDCC, on a query-focused multi-document summarization task over the standard DUC2005 dataset, using ROUGE-1 and ROUGE-2 precision and recall scores. The proposed model performs better than Latent Dirichlet Allocation and SenLDA but is outperformed by LDCC.
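
For readers unfamiliar with the evaluation metric, ROUGE-N precision and recall can be illustrated with a minimal sketch. It assumes whitespace tokenization, lowercasing, a single reference summary, and no stemming or stop-word removal, so it is not the official ROUGE toolkit and would not reproduce the scores reported in the article. The function names `ngrams` and `rouge_n` are hypothetical and chosen only for this illustration.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) that occur in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Return (precision, recall) of n-gram overlap between two strings.

    Simplifying assumptions: lowercase whitespace tokens, one reference,
    no stemming. Overlap counts are clipped, so an n-gram is matched at
    most as often as it appears in the reference.
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    overlap = sum((cand & ref).values())  # multiset intersection
    cand_total = sum(cand.values())
    ref_total = sum(ref.values())
    precision = overlap / cand_total if cand_total else 0.0
    recall = overlap / ref_total if ref_total else 0.0
    return precision, recall


if __name__ == "__main__":
    system_summary = "the cat sat on the mat"
    reference_summary = "the cat lay on the mat"
    print(rouge_n(system_summary, reference_summary, n=1))
    print(rouge_n(system_summary, reference_summary, n=2))
```

Precision divides the overlap by the number of n-grams in the system summary, while recall divides it by the number in the reference, which is why the two are reported separately.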

Keywords:
Automatic summarization; Computer science; Sentence; Multi-document summarization; Information retrieval; Natural language processing; Artificial intelligence

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 16
Citation Normalized Percentile: 0.13

Topics

Topic Modeling: Physical Sciences → Computer Science → Artificial Intelligence
Advanced Text Analysis Techniques: Physical Sciences → Computer Science → Artificial Intelligence
Natural Language Processing Techniques: Physical Sciences → Computer Science → Artificial Intelligence