JOURNAL ARTICLE

Unsupervised Parallel Sentences of Machine Translation for Asian Language Pairs

Shaolin Zhu, Chenggang Mi, Tianqi Li, Yong Yang, Xu Chun

Year: 2022   Journal: ACM Transactions on Asian and Low-Resource Language Information Processing   Vol: 22 (3)   Pages: 1-14   Publisher: Association for Computing Machinery

Abstract

Parallel sentence pairs play an important role in many natural language processing tasks, especially cross-lingual tasks such as machine translation. Many Asian language pairs still lack bilingual parallel sentences, and because collecting bilingual parallel data is time-consuming and difficult, mining such pairs automatically is especially important for low-resource Asian language pairs. While existing methods have shown encouraging results, they either rely heavily on bilingual data or have drawbacks in an unsupervised setting. To address these issues, we propose a new unsupervised similarity calculation and dynamic selection metric to obtain parallel sentence pairs without supervision. First, our method maps bilingual word embeddings by postdoc adversarial training, which rotates the source space to match the target space without parallel data. Then, we introduce a new cross-domain similarity adaptation to obtain parallel sentence pairs. Experimental results on real-world datasets show that our model obtains better accuracy and recall when mining parallel sentence pairs. We also show that the extracted bilingual sentence corpora can significantly improve the performance of neural machine translation.
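
The abstract does not spell out the similarity metric, so the following is only a minimal sketch of the general idea of neighbourhood-adjusted similarity for mining sentence pairs. It uses the well-known CSLS criterion (cross-domain similarity local scaling), 2·cos(x, y) − r_T(x) − r_S(y), as a stand-in; it is not the authors' "cross-domain similarity adaptation". The function names, the neighbourhood size `k`, the `threshold`, and the assumption that sentence embeddings are already mapped into a shared space are all hypothetical choices made for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def csls_scores(src, tgt, k=2):
    """CSLS-style score matrix: 2*cos(x, y) - r_T(x) - r_S(y).

    r_T(x) is the mean cosine between source vector x and its k most similar
    target vectors; r_S(y) is the same for target vector y over the source side.
    Assumes src and tgt are non-empty lists of vectors in a shared space.
    """
    sim = [[cosine(x, y) for y in tgt] for x in src]

    def mean_top_k(values):
        top = sorted(values, reverse=True)[:k]
        return sum(top) / len(top)

    r_src = [mean_top_k(row) for row in sim]
    r_tgt = [mean_top_k([sim[i][j] for i in range(len(src))])
             for j in range(len(tgt))]
    return [[2 * sim[i][j] - r_src[i] - r_tgt[j] for j in range(len(tgt))]
            for i in range(len(src))]

def mine_pairs(src, tgt, k=2, threshold=0.0):
    """For each source sentence, keep its best-scoring target if above threshold."""
    scores = csls_scores(src, tgt, k)
    pairs = []
    for i, row in enumerate(scores):
        j = max(range(len(row)), key=row.__getitem__)
        if row[j] > threshold:
            pairs.append((i, j, row[j]))
    return pairs

# Toy embeddings (made up): source 0 matches target 1, source 1 matches target 0.
src_vecs = [[1.0, 0.0], [0.0, 1.0]]
tgt_vecs = [[0.0, 1.0], [1.0, 0.0]]
mined = mine_pairs(src_vecs, tgt_vecs)
```

Subtracting the neighbourhood terms penalises "hub" sentences that are close to everything, which is why this kind of score tends to be preferred over raw cosine similarity when no parallel data is available to calibrate a threshold.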

Keywords:
Computer science, Machine translation, Sentence, Natural language processing, Artificial intelligence, Similarity (geometry), Parallel corpora, Word (group theory), Metric (unit), Linguistics

Metrics

Cited By: 10
FWCI (Field Weighted Citation Impact): 1.96
Refs: 29
Citation Normalized Percentile: 0.83

Topics

Topic Modeling (Physical Sciences → Computer Science → Artificial Intelligence)
Natural Language Processing Techniques (Physical Sciences → Computer Science → Artificial Intelligence)
Multimodal Machine Learning Applications (Physical Sciences → Computer Science → Computer Vision and Pattern Recognition)