JOURNAL ARTICLE

Extractive Text Summarization Using Formality of Language

Harsh Mehta, Santosh Kumar Bharti, Nishant Doshi

Year: 2025 Journal: IEEE Open Journal of the Computer Society Vol: 6 Pages: 1414-1425 Publisher: Institute of Electrical and Electronics Engineers

Abstract

Automatic text summarization has been a prominent research topic for over a decade, aiming to distill concise summaries from extensive textual documents. This study introduces a novel approach addressing the intricacies of morphologically rich Indo-Iranian languages. We propose a unique method that leverages linguistic formality to guide summary generation. Building on an existing formality formula designed for English, we adapt and extend it for the structural characteristics of Indo-Iranian languages, which follow the Subject-Object-Verb (SOV) order. Our refined formula demonstrates a 7.28% improvement in formality scores compared to informal texts, validated through statistical significance testing. To assess sentence formality, we use our custom formula alongside additional features such as Shannon entropy scores and numeric token presence, combining these into a comprehensive sentence evaluation metric. Using this framework, we generate extractive summaries of Gujarati texts. Comparative evaluations at 20% and 30% compression ratios reveal that our method outperforms existing baselines, with ROUGE-1 score improvements of 14.63% at 30% and 28.60% at 20% compression. For reproducibility and further exploration, all experimental data and source code are made publicly available.
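The sentence-scoring pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' released code. The abstract does not state the formality formula, the feature weights, or the tokenizer, so the `formality` callable, the `w_*` weights, and whitespace tokenization are all hypothetical placeholders, and the compression ratio is assumed to mean the fraction of sentences kept in the summary.

```python
import math
from collections import Counter
from typing import Callable, List


def shannon_entropy(tokens: List[str]) -> float:
    """Shannon entropy, in bits, of the token distribution in one sentence."""
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    total = len(tokens)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def has_numeric_token(tokens: List[str]) -> bool:
    """True if any token contains a digit character."""
    return any(any(ch.isdigit() for ch in tok) for tok in tokens)


def sentence_score(
    tokens: List[str],
    formality: Callable[[List[str]], float],  # hypothetical stand-in for the paper's formula
    w_formality: float = 1.0,  # assumed weights; the abstract gives none
    w_entropy: float = 1.0,
    w_numeric: float = 1.0,
) -> float:
    """Weighted sum of the three features named in the abstract."""
    return (
        w_formality * formality(tokens)
        + w_entropy * shannon_entropy(tokens)
        + w_numeric * float(has_numeric_token(tokens))
    )


def summarize(
    sentences: List[str],
    formality: Callable[[List[str]], float],
    ratio: float = 0.3,
) -> List[str]:
    """Keep the top-scoring fraction of sentences, in their original order."""
    if not sentences:
        return []
    tokenized = [s.split() for s in sentences]  # assumed whitespace tokenization
    k = max(1, round(len(sentences) * ratio))
    ranked = sorted(
        range(len(sentences)),
        key=lambda i: sentence_score(tokenized[i], formality),
        reverse=True,
    )
    return [sentences[i] for i in sorted(ranked[:k])]
```

Calling `summarize(sentences, formality=my_formality, ratio=0.2)` or `ratio=0.3` would correspond to the two compression settings reported above, with `my_formality` being whatever formality scorer is available. Selected sentences are returned in document order, a common choice in extractive summarization.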

Keywords:
Formality, Automatic summarization, Linguistics, Natural language processing, Computer science, Psychology, Philosophy

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 39
Citation Normalized Percentile: 0.14

Topics

Topic Modeling
Physical Sciences →  Computer Science →  Artificial Intelligence
Natural Language Processing Techniques
Physical Sciences →  Computer Science →  Artificial Intelligence
Advanced Text Analysis Techniques
Physical Sciences →  Computer Science →  Artificial Intelligence

Related Documents

JOURNAL ARTICLE

Text Summarization Using Extractive Techniques for Indian Language

Manasi Chouk, Neelam Phadnis

Journal: International Journal of Computer Trends and Technology Year: 2021 Vol: 69 (6) Pages: 44-49
BOOK-CHAPTER

Extractive Text Summarization for Azerbaijani Language

Mir Amir Pashayev

Communications in computer and information science Year: 2025 Pages: 351-356
JOURNAL ARTICLE

Extractive Text Summarization for Ge'ez Language

Dejen Wuletaw

Journal: National Academic Digital Repository of Ethiopia Year: 2024
JOURNAL ARTICLE

Extractive Text Summarization Using Deep Learning for Tigrigna Language

Meresa Hiluf Gebrehiwot, Michael Melese

Journal: International Journal on Data Science and Technology Year: 2023