JOURNAL ARTICLE

Pre-Trained Transformer-Based Models for Text Classification Using Low-Resourced Ewe Language

Abstract

Although there have been a few attempts to automatically crawl Ewe text from online news portals and magazines, the African Ewe language remains computationally underdeveloped in spite of its rich morphology and complex, unique structure. The crawled Ewe texts are of poor quality, unbalanced, and predominantly religious in content, which makes them difficult to preprocess and unsuitable for NLP tasks with current transformer-based language models. In this study, we present a well-preprocessed Ewe dataset for low-resource text classification to the research community. We also develop an Ewe word embedding to provide a semantic representation for this low-resource language. Finally, we fine-tune seven transformer-based models, namely BERT-base (cased and uncased), DistilBERT-base (cased and uncased), RoBERTa, DistilRoBERTa, and DeBERTa, on the proposed Ewe dataset. Extensive experiments show that the fine-tuned BERT-base-cased model outperforms all baseline models, achieving an accuracy of 0.972, a precision of 0.969, a recall of 0.970, an F1-score of 0.970, and a loss of 0.021. This performance demonstrates the model's ability to capture the semantics of low-resourced Ewe better than the other models, establishing the fine-tuned BERT-base-cased model as the benchmark for the proposed Ewe dataset.
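
The headline result, fine-tuning bert-base-cased for Ewe text classification, follows the standard sequence-classification recipe. The sketch below illustrates that recipe using the Hugging Face transformers and datasets libraries; the file name "ewe_news.csv", its "text"/"label" columns, the four-class label set, and the hyperparameters are hypothetical stand-ins, since the listing does not specify them.

# A minimal sketch of the fine-tuning setup described in the abstract, assuming
# the Hugging Face `transformers` and `datasets` libraries. The dataset file,
# its column names, the label count, and the hyperparameters are hypothetical.
import numpy as np
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

MODEL_NAME = "bert-base-cased"  # best-performing model reported in the paper
NUM_LABELS = 4                  # hypothetical number of Ewe text classes

# Load the (hypothetical) preprocessed Ewe corpus and hold out 20% for evaluation.
dataset = load_dataset("csv", data_files="ewe_news.csv")["train"]
splits = dataset.train_test_split(test_size=0.2, seed=42)

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)

def tokenize(batch):
    # Truncate/pad Ewe sentences to a fixed length so they batch cleanly.
    return tokenizer(batch["text"], truncation=True,
                     padding="max_length", max_length=128)

encoded = splits.map(tokenize, batched=True)

model = AutoModelForSequenceClassification.from_pretrained(
    MODEL_NAME, num_labels=NUM_LABELS)

def compute_metrics(eval_pred):
    # Accuracy only, for brevity; the paper also reports precision, recall, and F1.
    logits, labels = eval_pred
    return {"accuracy": (np.argmax(logits, axis=-1) == labels).mean()}

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="ewe-bert", num_train_epochs=3,
                           per_device_train_batch_size=16),
    train_dataset=encoded["train"],
    eval_dataset=encoded["test"],
    compute_metrics=compute_metrics,
)
trainer.train()
print(trainer.evaluate())

The same loop would be repeated for the other six checkpoints (e.g. "distilbert-base-cased", "roberta-base", "microsoft/deberta-base") by swapping MODEL_NAME, which is how the paper's model comparison is typically reproduced.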

Keywords:
Transformer, Language model, Natural language processing, Word embedding, F1 score, Benchmark, Artificial intelligence, Computer science

Metrics

Cited By: 9
FWCI (Field-Weighted Citation Impact): 2.30
References: 46
Citation Normalized Percentile: 0.88

Topics

Text and Document Classification Technologies (Physical Sciences → Computer Science → Artificial Intelligence)
Religion and Sociopolitical Dynamics in Nigeria (Social Sciences → General Social Sciences)
Topic Modeling (Physical Sciences → Computer Science → Artificial Intelligence)

Related Documents

JOURNAL ARTICLE

Multilingual Text Summarization in Healthcare Using Pre-Trained Transformer-Based Language Models

Josua Käser, Thomas Nagy, Patrick Stirnemann, Thomas Hanne

Journal:   Computers, Materials & Continua   Year: 2025   Vol: 83 (1)   Pages: 201-217
JOURNAL ARTICLE

Pre-trained transformer-based language models for Sundanese

Wilson Wongso, Henry Lucky, Derwin Suhartono

Journal:   Journal of Big Data   Year: 2022   Vol: 9 (1)
JOURNAL ARTICLE

Explainable Pre-Trained Language Models for Sentiment Analysis in Low-Resourced Languages

Koena Ronny Mabokela, Mpho Primus, Turgay Çelik

Journal:   Big Data and Cognitive Computing   Year: 2024   Vol: 8 (11)   Pages: 160