JOURNAL ARTICLE

Automated Bangla text summarization by sentence scoring and ranking

Abstract

Document summarization is an area of Natural Language Processing (NLP) that is attracting growing interest from researchers. Although many techniques have been proposed for English, only a few notable works address Bangla text summarization. This paper presents an extraction-based summarization technique for Bangla text documents. The system summarizes a single document at a time. Before summarization, the document is pre-processed by tokenization, stop-word removal, and stemming. During summarization, countable features such as word frequency and sentence positional value are used to make the summary more precise and concrete. Attributes such as cue words and the skeleton of the document are also included, which helps make the summary more relevant to the document's content. The proposed technique has been compared against summaries of the same documents produced by human professionals. The evaluation shows that 83.57% of the summary sentences selected by the system agreed with those selected by the human summarizers.
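
The abstract names the pipeline stages but not their formulas, so the following Python sketch is only an illustration of how sentence scoring and ranking of this kind might be put together. Everything specific in it is an assumption rather than the authors' method: the tiny stop-word, cue-word, and suffix lists, the crude suffix-stripping stemmer, the 1/(i+1) positional value, the reading of "skeleton" as overlap with title words, the equal feature weights, and the helper names (split_sentences, tokenize, stem, preprocess, summarize, agreement).

```python
import re
from collections import Counter

# Hypothetical placeholder lists; a real system would use curated Bangla resources.
STOP_WORDS = {"এবং", "ও", "এই", "একটি", "করে", "থেকে", "জন্য"}
CUE_WORDS = {"সুতরাং", "অতএব", "সংক্ষেপে"}
SUFFIXES = ("গুলো", "গুলি", "দের", "রা", "টি", "টা", "ের", "কে", "র")


def split_sentences(text):
    # Bangla sentences usually end with the danda (।); ? and ! are also accepted.
    return [s.strip() for s in re.split(r"[।?!]+", text) if s.strip()]


def tokenize(sentence):
    # Split on whitespace and common punctuation. \w+ is avoided on purpose:
    # in Python it does not match Bangla combining vowel signs.
    return re.findall(r"[^\s।,;:!?()'\"-]+", sentence)


def stem(word, suffixes=SUFFIXES):
    # Crude stemmer (assumption): strip the first matching suffix, keeping
    # at least two characters of the word.
    for suffix in suffixes:
        if word.endswith(suffix) and len(word) > len(suffix) + 1:
            return word[: -len(suffix)]
    return word


def preprocess(sentence, stop_words=STOP_WORDS):
    # Tokenize, drop stop words, then stem what remains.
    return [stem(t) for t in tokenize(sentence) if t not in stop_words]


def summarize(text, title="", k=2, stop_words=STOP_WORDS, cue_words=CUE_WORDS):
    sentences = split_sentences(text)
    stems = [preprocess(s, stop_words) for s in sentences]
    freq = Counter(t for sent in stems for t in sent)
    max_freq = max(freq.values(), default=1)
    title_stems = set(preprocess(title, stop_words))

    scores = []
    for i, (raw, sent) in enumerate(zip(sentences, stems)):
        # Word frequency: mean document frequency of the sentence's stems.
        freq_score = sum(freq[t] for t in sent) / (max_freq * len(sent)) if sent else 0.0
        # Positional value (assumed form): earlier sentences score higher.
        pos_score = 1.0 / (i + 1)
        # Cue words: bonus if any cue word appears in the raw sentence.
        cue_score = 1.0 if any(t in cue_words for t in tokenize(raw)) else 0.0
        # Skeleton (assumed to mean title words): share of title stems covered.
        title_score = len(set(sent) & title_stems) / len(title_stems) if title_stems else 0.0
        scores.append(freq_score + pos_score + cue_score + title_score)

    # Rank by score, keep the top k, and restore document order.
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


def agreement(system_sentences, human_sentences):
    # Fraction of system-selected sentences that the human summary also contains.
    if not system_sentences:
        return 0.0
    return len(set(system_sentences) & set(human_sentences)) / len(system_sentences)
```

Calling summarize(document_text, title=document_title, k=3) would return the three highest-scoring sentences in their original order, and agreement(system_summary, human_summary) gives one plausible way to express overlap with a human-written extract as a fraction. How the paper actually weights the features and computes its 83.57% figure is not stated in the abstract.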

Keywords:
Automatic summarization, Computer science, Multi-document summarization, Natural language processing, Lexical analysis, tf–idf, Artificial intelligence, Bengali, Sentence, Information retrieval, Ranking (information retrieval), Text graph, Keyword extraction, Natural language, Word (group theory), Process (computing), Stop words, Preprocessor, Linguistics, Term (time)

Metrics

Cited By: 34
FWCI (Field Weighted Citation Impact): 1.41
Refs: 7
Citation Normalized Percentile: 0.87

Topics

Topic Modeling (Physical Sciences → Computer Science → Artificial Intelligence)
Natural Language Processing Techniques (Physical Sciences → Computer Science → Artificial Intelligence)
Advanced Text Analysis Techniques (Physical Sciences → Computer Science → Artificial Intelligence)