JOURNAL ARTICLE

Learning Cross-Lingual Mappings in Imperfectly Isomorphic Embedding Spaces

Yuling Li, Kui Yu, Yuhong Zhang

Year: 2021
Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Vol: 29
Pages: 2630-2642
Publisher: Institute of Electrical and Electronics Engineers

Abstract

One mainstream method in cross-lingual word embeddings is to learn a linear mapping between two monolingual embedding spaces using a training dictionary. Successful linear mappings require isomorphic embedding spaces. However, monolingual embedding spaces are not perfectly isomorphic, and therefore, a linear mapping cannot align them accurately. In this study, we assume that two embedding spaces are composed of near-isomorphic translation pairs (NearITP) and non-isomorphic translation pairs. Owing to the nature of similar substructures, NearITP can make linear mapping work well. Motivated by this, we design a screening strategy to identify NearITP effectively. Based on this strategy, we find that the proportion of NearITP in the commonly used training dictionary is relatively low, leading to sub-optimal results. To address this problem, we propose a general framework that can be combined with any of the mapping methods, which further boosts subsequent mapping. Experimental results demonstrate that our framework is an improvement over existing mapping-based methods, and outperforms state-of-the-art models on two public data sets. Moreover, we show that our framework can be successfully generalized to contextual word embeddings such as multilingual BERT (mBERT), and further enhances the cross-lingual properties of mBERT.
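The linear-mapping baseline that the abstract starts from can be illustrated with a short sketch. This is not the authors' framework or their NearITP screening strategy; it only shows the generic orthogonal (Procrustes) mapping learned from a training dictionary, on synthetic data. The function name `learn_orthogonal_mapping`, the dimensions, and the noise level are assumptions made for illustration.

```python
import numpy as np

def learn_orthogonal_mapping(X, Y):
    """Return the orthogonal W minimizing ||X @ W - Y||_F (Procrustes)."""
    # SVD of the cross-covariance between paired source and target vectors.
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt

rng = np.random.default_rng(0)
n, d = 50, 5                                   # hypothetical dictionary size and dimension
X = rng.normal(size=(n, d))                    # synthetic source-language vectors
Q, _ = np.linalg.qr(rng.normal(size=(d, d)))   # hidden rotation between the spaces

# Case 1: perfectly isomorphic spaces, so the target is an exact rotation of the source.
Y_iso = X @ Q
W_iso = learn_orthogonal_mapping(X, Y_iso)
residual_iso = np.linalg.norm(X @ W_iso - Y_iso)

# Case 2: imperfectly isomorphic spaces, simulated here by adding noise to the target.
Y_noisy = Y_iso + 0.5 * rng.normal(size=(n, d))
W_noisy = learn_orthogonal_mapping(X, Y_noisy)
residual_noisy = np.linalg.norm(X @ W_noisy - Y_noisy)

print("residual (isomorphic):  ", residual_iso)
print("residual (non-isomorphic):", residual_noisy)
```

In the isomorphic case the learned map recovers the hidden rotation and the residual is numerically zero; once the target space deviates from a rotation of the source, no linear map aligns the pairs exactly, which is the limitation the abstract describes.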

Keywords:
Embedding; Translation (biology); Computer science; Word (group theory); Word embedding; Mathematics; Theoretical computer science; Artificial intelligence; Natural language processing; Geometry

Metrics

Cited By: 3
FWCI (Field Weighted Citation Impact): 0.42
Refs: 68
Citation Normalized Percentile: 0.69

Topics

Topic Modeling
Physical Sciences →  Computer Science →  Artificial Intelligence
Natural Language Processing Techniques
Physical Sciences →  Computer Science →  Artificial Intelligence
Hate Speech and Cyberbullying Detection
Physical Sciences →  Computer Science →  Artificial Intelligence