Automatic text summarization shortens lengthy texts by selecting important, understandable sentences. It is a part of Natural Language Processing (NLP) that uses machine learning techniques to generate accurate and non-redundant summaries. Summarizing lengthy text manually is challenging and time-consuming for humans, which motivates automatic text summarization: automating the process reduces the time and effort needed to produce summaries. There are two types of text summarization, extractive and abstractive. In this paper, we propose extractive summarization for the Ho language, the first study of its kind. Ho is both a tribal community of India and the language it speaks. To generate extractive summaries, we used thirty Ho-language stories written in the Warang Chiti script. The stories were summarized using the TF-IDF and TextRank algorithms, and the generated summaries were manually evaluated on different parameters. The machine-generated summaries were then compared with human-generated summaries with the help of human experts. The average F1 score is 15.17 for TextRank and 17.56 for TF-IDF, so TF-IDF performs better than TextRank on this dataset.
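To make the two extractive methods named above concrete, the following is a minimal sketch of sentence ranking with TF-IDF and with a TextRank-style graph. It is not the implementation used in this study: the function names (`split_sentences`, `tokenize`, `tfidf_scores`, `textrank_scores`, `summarize`), the punctuation-based sentence splitter, the word-overlap similarity, and the parameters `damping`, `iterations`, and `k` are all assumptions made for illustration, and real Ho text would need script-specific sentence and word segmentation.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    # Assumption: sentences end with '.', '!' or '?'. Ho text in its own
    # script would need a script-specific sentence splitter instead.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    # Assumption: words are runs of word characters, compared case-insensitively.
    return re.findall(r"\w+", sentence.lower())


def tfidf_scores(sentences):
    # Treat every sentence as a "document" and score it by the sum of the
    # TF-IDF weights of its distinct words.
    docs = [tokenize(s) for s in sentences]
    n = len(docs)
    df = Counter(w for d in docs for w in set(d))
    scores = []
    for d in docs:
        if not d:
            scores.append(0.0)
            continue
        tf = Counter(d)
        scores.append(sum((c / len(d)) * math.log(n / df[w]) for w, c in tf.items()))
    return scores


def textrank_scores(sentences, damping=0.85, iterations=50):
    # Build a sentence graph weighted by word overlap, then run a
    # PageRank-style power iteration over it.
    docs = [set(tokenize(s)) for s in sentences]
    n = len(docs)
    if n == 0:
        return []

    def similarity(a, b):
        # Overlap normalised by log lengths; sentences with fewer than two
        # distinct words are skipped to avoid dividing by log(1) = 0.
        if len(a) < 2 or len(b) < 2:
            return 0.0
        return len(a & b) / (math.log(len(a)) + math.log(len(b)))

    weights = [[similarity(docs[i], docs[j]) if i != j else 0.0 for j in range(n)]
               for i in range(n)]
    out_sum = [sum(row) for row in weights]
    rank = [1.0 / n] * n
    for _ in range(iterations):
        rank = [
            (1 - damping) / n
            + damping * sum(weights[j][i] * rank[j] / out_sum[j]
                            for j in range(n) if out_sum[j] > 0)
            for i in range(n)
        ]
    return rank


def summarize(text, scorer, k=2):
    # Keep the k best-scoring sentences and return them in document order.
    sentences = split_sentences(text)
    scores = scorer(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]
```

Calling `summarize(story, tfidf_scores, k=3)` or `summarize(story, textrank_scores, k=3)` on the same story returns two candidate summaries that can then be compared against a human-written one. Returning the selected sentences in their original order is a design choice that keeps the extract readable.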
Ángel Hernández-Castañeda, René Arnulfo García-Hernández, Yulia Ledeneva, Christian Eduardo Millán-Hernández
Pooja Gupta, Swati Nigam, Rajiv Singh