JOURNAL ARTICLE

Random forest classifier based multi-document summarization system

Abstract

In recent times, the generation of multi-document summaries has gained considerable attention among researchers due to the information explosion in web media. Most text summarization techniques rely on sentence extraction, in which the salient sentences in the multiple documents are extracted and presented as a summary. In our proposed system, we have developed a random forest classifier based multi-document summarization system that classifies each sentence in the multiple documents as either belonging or not belonging to the summary. For this purpose, each sentence in the documents is represented by a set of feature scores. The classifier is trained using the feature scores and the summary information of each sentence in the document set. The feature scores of the sentences of the multiple documents to be summarized are given as the test data for the classifier. From the sentences that the classifier assigns to the summary class, a summary of the required size is generated using Maximal Marginal Relevance. The experiments are conducted using the DUC 2002 dataset and its corresponding summaries. Experimental results show that the quality of the summary generated by this method is good in terms of relevance and novelty.
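The final selection step described in the abstract can be illustrated with a minimal sketch of greedy Maximal Marginal Relevance. It assumes the classifier has already produced the summary-class sentences together with a relevance score for each (for example, a class probability). The abstract does not state the relevance measure, the similarity measure, or the trade-off weight, so the bag-of-words cosine similarity, the value `lam=0.7`, and the names `cosine` and `mmr_select` are illustrative assumptions rather than the authors' implementation.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_select(sentences, relevance, max_sentences, lam=0.7):
    """Greedy MMR: repeatedly pick the candidate that maximizes
    lam * relevance - (1 - lam) * (max similarity to sentences already picked)."""
    vectors = [Counter(s.lower().split()) for s in sentences]
    selected = []
    candidates = list(range(len(sentences)))
    while candidates and len(selected) < max_sentences:
        def score(i):
            redundancy = max(
                (cosine(vectors[i], vectors[j]) for j in selected), default=0.0
            )
            return lam * relevance[i] - (1 - lam) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return [sentences[i] for i in selected]


# Hypothetical summary-class sentences and relevance scores (assumed inputs).
summary_class = [
    "the storm hit the coast on monday",
    "the storm hit the coast on monday morning",
    "thousands of residents were evacuated",
]
relevance = [0.9, 0.85, 0.6]

summary = mmr_select(summary_class, relevance, max_sentences=2)
print(summary)  # the near-duplicate second sentence is skipped in favour of the third
```

With `lam=1.0` the redundancy penalty vanishes and the same call simply returns the two highest-scoring sentences, which shows how the weight trades relevance against novelty.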

Keywords:
Automatic summarization, Computer science, Classifier (UML), Sentence, Novelty, Artificial intelligence, Salient, Random forest, Multi-document summarization, Natural language processing, Naive Bayes classifier, Feature extraction, Information retrieval, Pattern recognition (psychology), Support vector machine

Metrics

Cited By: 22
FWCI (Field Weighted Citation Impact): 0.47
Refs: 31
Citation Normalized Percentile: 0.77

Topics

Topic Modeling
Advanced Text Analysis Techniques
Natural Language Processing Techniques
Field (all three topics): Physical Sciences → Computer Science → Artificial Intelligence

Related Documents

JOURNAL ARTICLE

Static video summarization using multi-CNN with sparse autoencoder and random forest classifier

Madhu S. Nair, Jesna Mohan

Journal: Signal Image and Video Processing, Year: 2020, Vol: 15 (4), Pages: 735-742

JOURNAL ARTICLE

Web-Based Dynamic Multi-Document Summarization System Framework

Meiling Liu, Honge Ren, Yu Yang, Dequan Zheng, Tiejun Zhao

Journal: Journal of Software, Year: 2013, Vol: 24 (5), Pages: 1006-1021

JOURNAL ARTICLE

Event-based Multi-document Summarization

dos Santos Marujo, Luis

Journal: OPAL (Open@LaTrobe) (La Trobe University), Year: 2025