JOURNAL ARTICLE

Extractive Text Summarization for Ge'ez Language

Dejen Wuletaw

Year: 2024 Journal:   National Academic Digital Repository of Ethiopia

Abstract

The amount of text available online is increasing rapidly, making it difficult to find essential or relevant information. Because of this volume, significant time and resources are spent trying to identify and understand the relevant material. Effective information reduction techniques are therefore needed to produce concise, accurate, and coherent summaries that preserve the original meaning. Automatic text summarization is one such tool: it provides a concise overview of a document. To address the challenges of summarizing Ge'ez texts, this study proposes an improved extractive summarization method. It modifies existing research methodology by using the term frequency-inverse document frequency (TF*IDF) and TextRank algorithms to create a more effective system that generates summaries by selecting important sentences from the original document. A graph is built in which the sentences of the document are nodes and the edges between them represent sentence similarity. The most important sentences are then selected from the original document, and the summary is formed from these extracted sentences. For evaluation, the researcher prepared a dataset of 120 Ge'ez text documents, manually labeled by experts. Documents range from 101 to 1041 words, with an average length of 18 sentences and 256 words. The proposed method was tested at 25% and 30% extraction rates against the prepared reference summaries (240 documents) and compared with the standard TextRank and TF*IDF algorithms, using precision, recall, and F1-score as evaluation measures. ROUGE-1, ROUGE-2, and ROUGE-L scores were calculated for all methods. At the 30% extraction rate, the average F1-scores for the TF*IDF experiment were 72.5%, 59.2%, and 68.44% on ROUGE-1, ROUGE-2, and ROUGE-L, respectively. For the TextRank experiment, these scores were 79%, 65.67%, and 75%, respectively. The results of the experiment indicate that the proposed method can effectively summarize documents irrespective of the category they belong to.
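The pipeline described in the abstract (TF*IDF weighting, a sentence-similarity graph, TextRank scoring, selection at a fixed extraction rate, and ROUGE-1 F1 evaluation) can be illustrated with a minimal Python sketch. This is not the author's implementation. The abstract does not say how TF*IDF and TextRank are combined, so the sketch assumes TF*IDF cosine similarity as the edge weight. Whitespace tokenization, the smoothed IDF formula, the damping factor of 0.85, the rounding of the sentence count, and all function names (`tfidf_vectors`, `cosine`, `textrank`, `summarize`, `rouge1_f1`) are likewise assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(tokenized):
    """Return one sparse TF-IDF vector (a dict) per tokenized sentence."""
    n = len(tokenized)
    df = Counter(term for sent in tokenized for term in set(sent))
    vectors = []
    for sent in tokenized:
        tf = Counter(sent)
        # Smoothed IDF (an assumption) keeps weights positive for common terms.
        vectors.append({t: (c / len(sent)) * (math.log((1 + n) / (1 + df[t])) + 1)
                        for t, c in tf.items()})
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = (math.sqrt(sum(w * w for w in u.values()))
            * math.sqrt(sum(w * w for w in v.values())))
    return dot / norm if norm else 0.0


def textrank(sim, damping=0.85, iterations=50):
    """Weighted PageRank over a symmetric similarity matrix."""
    n = len(sim)
    scores = [1.0 / n] * n
    out_weight = [sum(row) for row in sim]
    for _ in range(iterations):
        scores = [
            (1 - damping) / n + damping * sum(
                sim[j][i] / out_weight[j] * scores[j]
                for j in range(n) if out_weight[j] > 0)
            for i in range(n)
        ]
    return scores


def summarize(sentences, rate=0.30):
    """Keep the top-ranked fraction of sentences, in original order."""
    # Whitespace tokenization is an assumption, not the paper's preprocessing.
    vecs = tfidf_vectors([s.split() for s in sentences])
    n = len(sentences)
    sim = [[cosine(vecs[i], vecs[j]) if i != j else 0.0 for j in range(n)]
           for i in range(n)]
    scores = textrank(sim)
    k = max(1, round(rate * n))  # rounding rule is an assumption
    keep = sorted(sorted(range(n), key=lambda i: -scores[i])[:k])
    return [sentences[i] for i in keep]


def rouge1_f1(candidate, reference):
    """Unigram-overlap F1 between a candidate and a reference summary."""
    cand, ref = Counter(candidate.split()), Counter(reference.split())
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```

Calling `summarize(sentences, rate=0.25)` or `summarize(sentences, rate=0.30)` on a list of sentence strings mirrors the two extraction rates reported above, and `rouge1_f1(" ".join(summary), reference_text)` gives a rough unigram-level score. The published ROUGE figures may rely on different tokenization or stemming, so the sketch should not be expected to reproduce them.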

Keywords:
Automatic summarization; Keyword extraction; Term (time); Information extraction; Graph; Range (aeronautics); Text graph

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 0
Citation Normalized Percentile: 0.44

Topics

Topic Modeling
Physical Sciences →  Computer Science →  Artificial Intelligence
Text and Document Classification Technologies
Physical Sciences →  Computer Science →  Artificial Intelligence
Natural Language Processing Techniques
Physical Sciences →  Computer Science →  Artificial Intelligence

Related Documents

BOOK-CHAPTER

Extractive Text Summarization for Azerbaijani Language

Mir Amir Pashayev

Communications in computer and information science Year: 2025 Pages: 351-356
JOURNAL ARTICLE

Extractive Text Summarization Using Formality of Language

Harsh Mehta, Santosh Kumar Bharti, Nishant Doshi

Journal:   IEEE Open Journal of the Computer Society Year: 2025 Vol: 6 Pages: 1414-1425
JOURNAL ARTICLE

Extractive Text Summarization Models for Urdu Language

Ali Nawaz, Maheen Bakhtyar, Junaid Baber, Ihsan Ullah, Waheed Noor, Abdul Basit

Journal:   Information Processing & Management Year: 2020 Vol: 57 (6) Pages: 102383-102383