JOURNAL ARTICLE

Unsupervised Word Segmentation Improves Dialectal Arabic to English Machine Translation

Abstract

We demonstrate the feasibility of using unsupervised morphological segmentation for dialects of Arabic, which are poor in linguistic resources. Our experiments using a Qatari Arabic to English machine translation system show that unsupervised segmentation improves translation quality compared to using no segmentation or to using ATB segmentation, which was designed specifically for Modern Standard Arabic (MSA). We use MSA and other dialects to improve Qatari Arabic to English machine translation, and we show that a uniform segmentation scheme across them yields an improvement of 1.5 BLEU points over using no segmentation.

Keywords:
Segmentation, Computer science, Artificial intelligence, Machine translation, Arabic, Natural language processing, Text segmentation, Translation (biology), Modern Standard Arabic, Word (group theory), Speech recognition, Pattern recognition (psychology), Linguistics

Metrics

Cited By: 14
FWCI (Field Weighted Citation Impact): 1.93
Refs: 40
Citation Normalized Percentile: 0.89

Topics

Natural Language Processing Techniques
Physical Sciences →  Computer Science →  Artificial Intelligence
Topic Modeling
Physical Sciences →  Computer Science →  Artificial Intelligence
Speech and dialogue systems
Physical Sciences →  Computer Science →  Artificial Intelligence