JOURNAL ARTICLE

Unsupervised Cross-lingual Word Embedding Representation for English-isiZulu

Abstract

In this study, we investigate the effectiveness of cross-lingual word embeddings for zero-shot transfer learning between a high-resource language, English, and a low-resource language, isiZulu. IsiZulu belongs to the South African Nguni language family, which is characterised by complex agglutinating morphology. We use VecMap, an open-source tool, to obtain the cross-lingual word embeddings. As an extrinsic evaluation of the embeddings, we train a news classifier on labelled English data and use it to categorise unlabelled isiZulu data by zero-shot transfer. Our model achieves a weighted average F1-score of 0.34. Our findings demonstrate that VecMap generates modular word embeddings in the cross-lingual space that have an impact on the downstream classifier used for zero-shot transfer learning.
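
The evaluation described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the two-dimensional vectors are hypothetical stand-ins for embeddings already mapped into a shared space (they are not VecMap output), the nearest-centroid classifier is a simple substitute for the news classifier, whose architecture the abstract does not specify, and all function names are invented for illustration. The isiZulu tokens are rough glosses of the English ones. The sketch shows only the general idea: fit on labelled English documents, predict on isiZulu documents without any isiZulu labels, and score with a support-weighted average of per-class F1.

```python
from collections import Counter, defaultdict

# Hypothetical toy vectors in one shared cross-lingual space.
# Real embeddings would have hundreds of dimensions.
EMB = {
    "election": (0.9, 0.1), "minister": (0.8, 0.2),
    "match": (0.1, 0.9), "goal": (0.2, 0.8),
    "ukhetho": (0.85, 0.15), "ungqongqoshe": (0.75, 0.25),
    "umdlalo": (0.15, 0.85), "igoli": (0.25, 0.75),
}


def doc_vector(tokens):
    """Average the vectors of known tokens; unknown tokens are skipped."""
    vecs = [EMB[t] for t in tokens if t in EMB]
    if not vecs:
        return (0.0, 0.0)
    return tuple(sum(col) / len(vecs) for col in zip(*vecs))


def train_centroids(docs, labels):
    """Compute one mean document vector per class from labelled documents."""
    grouped = defaultdict(list)
    for tokens, label in zip(docs, labels):
        grouped[label].append(doc_vector(tokens))
    return {
        label: tuple(sum(col) / len(vecs) for col in zip(*vecs))
        for label, vecs in grouped.items()
    }


def predict(centroids, tokens):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    v = doc_vector(tokens)
    return min(
        centroids,
        key=lambda lab: sum((a - b) ** 2 for a, b in zip(v, centroids[lab])),
    )


def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with weights equal to each class's support."""
    support = Counter(y_true)
    total = 0.0
    for label, n in support.items():
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = n - tp
        total += n * (2 * tp / (2 * tp + fp + fn))
    return total / len(y_true)


# Train on labelled English documents only.
english_docs = [["election", "minister"], ["match", "goal"]]
english_labels = ["politics", "sport"]
centroids = train_centroids(english_docs, english_labels)

# Zero-shot: classify isiZulu documents; gold labels are used for scoring only.
zulu_docs = [["ukhetho", "ungqongqoshe"], ["umdlalo", "igoli"], ["igoli", "ukhetho"]]
zulu_gold = ["politics", "sport", "sport"]
zulu_pred = [predict(centroids, doc) for doc in zulu_docs]
score = weighted_f1(zulu_gold, zulu_pred)
print(zulu_pred, round(score, 2))
```

Weighting by support means that frequent news categories dominate the score, so a single weighted figure such as 0.34 can hide very uneven per-class performance.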

Keywords:
Computer science; Natural language processing; Artificial intelligence; Classifier (UML); Transfer of learning; Word (group theory); Embedding; Linguistics

Metrics

Cited By: 1
FWCI (Field Weighted Citation Impact): 0.26
Refs: 30
Citation Normalized Percentile: 0.56

Topics

Topic Modeling (Physical Sciences → Computer Science → Artificial Intelligence)
Natural Language Processing Techniques (Physical Sciences → Computer Science → Artificial Intelligence)
Computational and Text Analysis Methods (Social Sciences → Social Sciences → General Social Sciences)