JOURNAL ARTICLE

Indonesian Abstractive Summarization using Pre-trained Model

Abstract

Automatic text summarization systems are increasingly needed to cope with the information explosion caused by the growth of the internet. Since Indonesian is still considered an under-resourced language, we take advantage of pre-trained language models to perform abstractive summarization. This paper investigates BERT's performance on Indonesian articles by comparing several pre-trained BERT models and evaluating the results using ROUGE scores. Our experiments show that an English pre-trained model can produce a reasonable summary of Indonesian text, but using an Indonesian pre-trained model is more effective. Training with only the abstractive objective outperforms two-stage fine-tuning, in which an extractive model must be trained first. We also found many meaningless words in the generated summaries. These findings are the result of a preliminary study toward improving Indonesian abstractive summarization models.
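
A minimal sketch of the ROUGE evaluation referenced above, using Google's rouge-score Python package (pip install rouge-score). The paper does not state which ROUGE implementation it used, and the Indonesian sentences below are illustrative rather than taken from its data:

    # ROUGE-1, ROUGE-2, and ROUGE-L are the variants summarization papers
    # most commonly report. Stemming is disabled because the bundled
    # Porter stemmer is English-only and would mangle Indonesian tokens.
    from rouge_score import rouge_scorer

    scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"],
                                      use_stemmer=False)

    reference = "pemerintah mengumumkan kebijakan baru untuk sektor pendidikan"
    candidate = "pemerintah mengumumkan kebijakan pendidikan yang baru"

    # score(target, prediction) returns one Score namedtuple per variant,
    # each carrying precision, recall, and F1 (fmeasure).
    scores = scorer.score(reference, candidate)
    for name, s in scores.items():
        print(f"{name}: P={s.precision:.3f} R={s.recall:.3f} F1={s.fmeasure:.3f}")

Higher n-gram overlap between the candidate and the reference yields higher scores for each variant; ROUGE-2 and ROUGE-L are the stricter signals, since they require matching bigrams and longest common subsequences respectively.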

Keywords:
Automatic summarization, Indonesian, Computer science, Natural language processing, Artificial intelligence, Language model, The Internet, Information retrieval, Linguistics, World Wide Web

Metrics

Cited by: 14
FWCI (Field-Weighted Citation Impact): 1.69
References: 23
Citation Normalized Percentile: 0.86

Topics

Topic Modeling (Physical Sciences → Computer Science → Artificial Intelligence)
Natural Language Processing Techniques (Physical Sciences → Computer Science → Artificial Intelligence)
Advanced Text Analysis Techniques (Physical Sciences → Computer Science → Artificial Intelligence)
