JOURNAL ARTICLE

A Hybrid Topic Model for Multi-Document Summarization

JinAn Xu, Jiangming Liu, Kenji Araki

Year: 2015
Journal: IEICE Transactions on Information and Systems
Vol: E98.D (5)
Pages: 1089-1094
Publisher: Institute of Electronics, Information and Communication Engineers

Abstract

Topic features are useful for improving text summarization. However, most topic models assume that topics are independent of one another, which is a strong restriction; relaxing it allows a model to capture text structure more deeply. This paper proposes a hybrid topic model that generates multi-document summaries by combining the Hidden Topic Markov Model (HTMM), a surface texture model, and a topic transition model. The topic transition model supplies regular topic transition probabilities, which are used while generating the summary. This approach removes the topic independence assumption made by the Latent Dirichlet Allocation (LDA) model. Experimental results show the advantage of combining the three kinds of models. The paper's contributions are relaxing topic independence and integrating surface texture and shallow semantics of documents to improve summarization. In short, the paper aims to realize an advanced summarization system.
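
As an illustration only, the idea of a topic transition probability can be sketched as a first-order transition matrix estimated from sequences of per-sentence topic labels. This is not the authors' implementation: the function name, the add-one smoothing, and the assumption that each sentence already carries a single topic label are hypothetical choices made for this sketch.

```python
def estimate_topic_transitions(topic_sequences, num_topics, smoothing=1.0):
    """Estimate P(next_topic | current_topic) from sentence-level topic labels.

    topic_sequences: one list of integer topic ids per document, in sentence order.
    smoothing: additive (Laplace) smoothing constant; an assumption of this sketch.
    """
    # Start every cell at the smoothing value so no row sums to zero.
    counts = [[smoothing] * num_topics for _ in range(num_topics)]
    for seq in topic_sequences:
        # Count each adjacent pair (current sentence topic, next sentence topic).
        for cur, nxt in zip(seq, seq[1:]):
            counts[cur][nxt] += 1
    # Normalize each row into a probability distribution.
    return [[c / sum(row) for c in row] for row in counts]


# Toy input: two documents whose sentences are labeled with topics 0 and 1.
transitions = estimate_topic_transitions([[0, 0, 1], [0, 1, 1]], num_topics=2)
```

A summarizer could consult such a matrix when choosing the next sentence, preferring candidates whose topic is a likely successor of the previous sentence's topic instead of treating topics as independent.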

Keywords:
Automatic summarization, Computer science, Topic model, Latent Dirichlet allocation, Artificial intelligence, Multi-document summarization, Hidden Markov model, Information retrieval, Independence (probability theory), Transition (genetics), Document Structure Description, Natural language processing, XML, World Wide Web

Metrics

Cited By: 5
FWCI (Field Weighted Citation Impact): 0.00
Refs: 22
Citation Normalized Percentile: 0.03

Topics

Topic Modeling
Physical Sciences →  Computer Science →  Artificial Intelligence
Natural Language Processing Techniques
Physical Sciences →  Computer Science →  Artificial Intelligence
Advanced Text Analysis Techniques
Physical Sciences →  Computer Science →  Artificial Intelligence