In Natural Language Processing (NLP) the document summarization is an area that is getting interest of modern researchers. Though there are many techniques that have been proposed for English language but a few notable works have been done for Bangla text summarization. This paper deals with the development of an extraction based summarization technique which works on Bangla text documents. The system summarizes a single document at a time. Before creating the summary of a document, it is pre-processed by tokenization, removal of stop words and stemming. In the document summarization process, the countable features like word frequency and sentence positional value are used to make the summary more precise and concrete. Attributes like cue words and skeleton of the document are included in the process, which help to make the summary more relevant to the content of the document. The proposed technique has been compared with summary of documents generated by human professionals. The evaluation shows that 83.57% of summary sentences selected by the system agreed with those made by human.
S SudhakarAl-Falahi AhmedAishwarya Dhanaraj
Abhishek MahajaniVinay PandyaIsaac MariaDeepak Sharma