Cross-lingual transfer for low-resource Arabic language understanding

Khadige Abboud; Olga Golovneva; Christopher DiPersio

doi:10.18653/v1/2022.wanlp-1.21

Year: 2022 Pages: 225-237

Abstract

This paper explores cross-lingual transfer learning in natural language understanding (NLU), with a focus on bootstrapping Arabic from high-resource English and French for domain classification, intent classification, and named entity recognition tasks. We adopt a BERT-based architecture and pretrain three models using open-source Wikipedia data and large-scale commercial datasets: a monolingual Arabic model, a bilingual Arabic-English model, and a trilingual Arabic-English-French model. Additionally, we use an off-the-shelf machine translator to translate internal data from the source language, English, into the target language, Arabic, in an effort to enhance transfer learning through translation. We conduct experiments that finetune the three models for NLU tasks and evaluate them on a large internal dataset. Despite the morphological, orthographical, and grammatical differences between Arabic and the source languages, both transfer from the source languages and machine translation yield performance gains on a real-world Arabic test dataset, in a zero-shot setting as well as in a setting where the models are further finetuned on labeled data from the target language.

Keywords:

Computer science; Natural language processing; Artificial intelligence; Bootstrapping (finance); Transfer of learning; Machine translation; Arabic; Modern Standard Arabic; Linguistics

Metrics

FWCI (Field Weighted Citation Impact): 0.59

Citation Normalized Percentile: 0.69

Topics

Topic Modeling

Physical Sciences → Computer Science → Artificial Intelligence

Natural Language Processing Techniques

Physical Sciences → Computer Science → Artificial Intelligence

Multimodal Machine Learning Applications

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Related Documents

Debiasing Low-Resource Language Models Via Cross-Lingual Transfer Learning

Cross-lingual Transfer Learning for Low-Resource Natural Language Processing Tasks

Investigating Zero-shot Cross-lingual Language Understanding for Arabic

Cross-lingual Transfer Learning for Spoken Language Understanding

Cross-Lingual Transfer with Language-Specific Subnetworks for Low-Resource Dependency Parsing