JOURNAL ARTICLE

Dual Word Embedding for Robust Unsupervised Bilingual Lexicon Induction

Hailong Cao, Liguo Li, Conghui Zhu, Muyun Yang, Tiejun Zhao

Year: 2023 Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing Vol: 31 Pages: 2606-2615 Publisher: Institute of Electrical and Electronics Engineers

Abstract

Word embedding models such as Word2vec and FastText learn two representations of each word simultaneously: input vectors and output vectors. In contrast, almost all existing unsupervised bilingual lexicon induction (UBLI) methods use only the input vectors and ignore the output vectors. In this paper, we propose a novel approach that makes full use of both input and output vectors for more robust and stronger UBLI. We discover the Common Difference Property: a single orthogonal transformation can connect not only the input vectors of two languages but also their output vectors. Therefore, we can learn just one transformation and use it to induce two different dictionaries, one from the input vectors and one from the output vectors. Because these two dictionaries differ substantially, taking their intersection during the UBLI procedure yields a more accurate lexicon with less noise. Extensive experiments show that our method achieves much more robust and stronger results than state-of-the-art methods on distant language pairs, while preserving comparable performance on similar language pairs.
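
The intersection step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes an orthogonal map W is already available (here a synthetic one), uses plain cosine nearest-neighbor retrieval as a stand-in for whatever retrieval criterion the paper uses, and adds an orthogonal Procrustes refit as an assumed example of how an intersected lexicon could feed a self-learning loop. All function and variable names are hypothetical.

```python
import numpy as np

def normalize(M):
    """Scale each row to unit length so dot products are cosine similarities."""
    return M / np.linalg.norm(M, axis=1, keepdims=True)

def induce_dictionary(src, tgt, W):
    """Map source rows with W and pair each with its nearest target row."""
    sims = normalize(src @ W) @ normalize(tgt).T
    return {(i, int(j)) for i, j in enumerate(sims.argmax(axis=1))}

def intersect_dictionaries(src_in, src_out, tgt_in, tgt_out, W):
    """Keep only the pairs proposed by BOTH the input and the output vectors."""
    from_input = induce_dictionary(src_in, tgt_in, W)
    from_output = induce_dictionary(src_out, tgt_out, W)
    return from_input & from_output

def procrustes(src, tgt, pairs):
    """Orthogonal W minimizing ||src[i] @ W - tgt[j]|| over the given pairs."""
    s_idx, t_idx = map(list, zip(*sorted(pairs)))
    U, _, Vt = np.linalg.svd(src[s_idx].T @ tgt[t_idx])
    return U @ Vt

# Synthetic toy data (an assumption for illustration, not real embeddings):
# the same orthogonal Q links the input spaces and the output spaces.
rng = np.random.default_rng(0)
n_words, dim = 50, 10
Q, _ = np.linalg.qr(rng.normal(size=(dim, dim)))
src_in = rng.normal(size=(n_words, dim))
src_out = rng.normal(size=(n_words, dim))
tgt_in = src_in @ Q + 0.01 * rng.normal(size=(n_words, dim))
tgt_out = src_out @ Q + 0.01 * rng.normal(size=(n_words, dim))

lexicon = intersect_dictionaries(src_in, src_out, tgt_in, tgt_out, Q)
W_refit = procrustes(src_in, tgt_in, lexicon)
```

In this toy setup both dictionaries agree on every word, so the intersection removes nothing. On real embeddings the two dictionaries would be expected to disagree on noisy entries, which is where the intersection filters them out.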

Keywords:
Computer science; Word2vec; Lexicon; Word (group theory); Word embedding; Dual (grammatical number); Artificial intelligence; Orthogonal transformation; Transformation (genetics); Embedding; Intersection (aeronautics); Property (philosophy); Natural language processing; Pattern recognition (psychology); Algorithm; Mathematics; Linguistics

Metrics

Cited By: 2
FWCI (Field Weighted Citation Impact): 0.51
Refs: 37
Citation Normalized Percentile: 0.65

Topics

Natural Language Processing Techniques
Physical Sciences → Computer Science → Artificial Intelligence
Topic Modeling
Physical Sciences → Computer Science → Artificial Intelligence
Text and Document Classification Technologies
Physical Sciences → Computer Science → Artificial Intelligence

Related Documents

JOURNAL ARTICLE

Bilingual word embedding fusion for robust unsupervised bilingual lexicon induction

Hailong Cao, Tiejun Zhao, Weixuan Wang, Wei Peng

Journal: Information Fusion Year: 2023 Vol: 97 Pages: 101818-101818

JOURNAL ARTICLE

Structure-Aware Dual Adversarial Autoencoder for Unsupervised Bilingual Lexicon Induction

Qian Tao, Ziyan Li, Bocheng Han, Lusi Li

Journal: IEEE Transactions on Audio, Speech and Language Processing Year: 2025 Vol: 33 Pages: 4771-4786

JOURNAL ARTICLE

Reshaping Word Embedding Space With Monolingual Synonyms for Bilingual Lexicon Induction

Qiuyu Ding, Hailong Cao, Conghui Zhu, Tiejun Zhao

Journal: IEEE Transactions on Audio, Speech and Language Processing Year: 2025 Vol: 33 Pages: 785-796