Abstract

Word Embeddings (WE) are increasingly popular and widely applied in many Natural Language Processing (NLP) applications due to their effectiveness in capturing the semantic properties of words; Machine Translation (MT), Information Retrieval (IR) and Information Extraction (IE) are among such areas. In this paper, we propose ArbEngVec, an open-source resource that provides several Arabic-English cross-lingual word embedding models. To train our bilingual models, we use a large dataset of more than 93 million Arabic-English parallel sentence pairs. In addition, we perform both extrinsic and intrinsic evaluations of the different word embedding model variants. The extrinsic evaluation assesses model performance on cross-language Semantic Textual Similarity (STS), while the intrinsic evaluation is based on the Word Translation (WT) task.
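The intrinsic Word Translation (WT) evaluation mentioned above can be sketched as nearest-neighbour search under cosine similarity in a shared Arabic-English embedding space. The snippet below is a minimal illustration only: the tiny hand-written vectors and the transliterated Arabic words are toy placeholders, not the actual ArbEngVec embeddings, and the `translate` helper is a hypothetical name.

```python
# Minimal sketch of word-translation-by-nearest-neighbour in a shared
# bilingual embedding space. Toy vectors stand in for real ArbEngVec models.
import numpy as np

# Hypothetical shared-space embeddings (word -> vector); in practice these
# would be loaded from a trained cross-lingual embedding model.
embeddings = {
    "kitab": np.array([0.90, 0.10, 0.00]),  # Arabic "book" (transliterated)
    "book":  np.array([0.88, 0.12, 0.02]),
    "bayt":  np.array([0.12, 0.88, 0.00]),  # Arabic "house" (transliterated)
    "house": np.array([0.10, 0.90, 0.05]),
}

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def translate(word, candidates):
    """Pick the target-language candidate closest to `word` in the shared space."""
    src = embeddings[word]
    return max(candidates, key=lambda c: cosine(src, embeddings[c]))

print(translate("kitab", ["book", "house"]))  # -> book
```

In the paper's setting the candidate set would be the full target-language vocabulary, and accuracy is measured as the fraction of source words whose nearest neighbour is a correct translation.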

Keywords:
Arabic, Computer science, Natural language processing, Word embedding, Artificial intelligence, Linguistics

Metrics

- Cited by: 16
- FWCI (Field-Weighted Citation Impact): 1.54
- References: 39
- Citation Normalized Percentile: 0.86


Topics

- Natural Language Processing Techniques (Physical Sciences → Computer Science → Artificial Intelligence)
- Topic Modeling (Physical Sciences → Computer Science → Artificial Intelligence)
- Text Readability and Simplification (Physical Sciences → Computer Science → Artificial Intelligence)

Related Documents

JOURNAL ARTICLE

ArbEngVec: Arabic-English Cross-Lingual Word Embedding Model

Raki Lachraf, El Moatez Billah Nagoudi, Youcef Ayachi, Ahmed Abdelali, Didier Schwab

Journal: HAL (Le Centre pour la Communication Scientifique Directe), Year: 2019
BOOK-CHAPTER

Cross-Lingual Word Embedding Models: Typology

Anders Søgaard, Ivan Vulić, Sebastian Ruder, Manaal Faruqui

Synthesis Lectures on Human Language Technologies, Year: 2019, Pages: 13-20
JOURNAL ARTICLE

Word Embedding for Cross-lingual Natural Language Analysis

Yukun Hu

Journal: Highlights in Science Engineering and Technology, Year: 2023, Vol: 68, Pages: 320-326
JOURNAL ARTICLE

Research of BERT Cross-Lingual Word Embedding Learning

WANG Yurong, LIN Min, LI Yanling

Journal: DOAJ (Directory of Open Access Journals), Year: 2021
JOURNAL ARTICLE

A Survey of Cross-lingual Word Embedding Models

Sebastian Ruder, Ivan Vulić, Anders Søgaard

Journal: Apollo (University of Cambridge), Year: 2018