JOURNAL ARTICLE

Improving Chinese-Vietnamese Neural Machine Translation with Linguistic Differences

Zhiqiang Yu, Zhengtao Yu, Yantuan Xian, Yuxin Huang, Junjun Guo

Year: 2022
Journal: ACM Transactions on Asian and Low-Resource Language Information Processing
Vol: 21 (2)
Pages: 1-12
Publisher: Association for Computing Machinery

Abstract

We present a simple, efficient data augmentation approach for boosting Chinese-Vietnamese neural machine translation performance by leveraging the linguistic difference between the two languages. We first define the formalized representation of modifier symmetry, which is one of the most representative linguistic differences between Chinese and Vietnamese. We then propose and test two data augmentation strategies for leveraging the linguistic difference, which can be integrated naturally with different translation models. Results indicate that both strategies can introduce linguistic rules to boost translation accuracy. Tests on Chinese-Vietnamese benchmarks show significant accuracy improvements. To facilitate studies in this domain, we also release an open-source toolkit with flexible implementation for Chinese-Vietnamese linguistic difference tagging.
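
The abstract does not describe the two augmentation strategies in detail, so the following is only an illustrative sketch of the general idea, not the authors' method or their toolkit's API. It assumes the input sentence is already segmented and part-of-speech tagged, and it relies on the well-known contrast that attributive adjectives precede the noun in Chinese but usually follow it in Vietnamese. The tag tokens `<mod>` and `</mod>`, the POS labels, and both function names are hypothetical.

```python
from typing import List, Tuple

# Hypothetical marker tokens; not taken from the paper or its toolkit.
MOD_OPEN, MOD_CLOSE = "<mod>", "</mod>"

TaggedSentence = List[Tuple[str, str]]  # (word, POS label) pairs


def tag_modifier_pairs(tagged: TaggedSentence) -> List[str]:
    """Wrap each adjacent adjective+noun pair in marker tokens."""
    out: List[str] = []
    i = 0
    while i < len(tagged):
        word, pos = tagged[i]
        is_pair = pos == "ADJ" and i + 1 < len(tagged) and tagged[i + 1][1] == "NOUN"
        if is_pair:
            out += [MOD_OPEN, word, tagged[i + 1][0], MOD_CLOSE]
            i += 2  # skip the noun that was just consumed
        else:
            out.append(word)
            i += 1
    return out


def reorder_modifier_pairs(tagged: TaggedSentence) -> List[str]:
    """Swap each adjacent adjective+noun pair into noun+adjective order."""
    out: List[str] = []
    i = 0
    while i < len(tagged):
        word, pos = tagged[i]
        is_pair = pos == "ADJ" and i + 1 < len(tagged) and tagged[i + 1][1] == "NOUN"
        if is_pair:
            out += [tagged[i + 1][0], word]  # noun first, then modifier
            i += 2
        else:
            out.append(word)
            i += 1
    return out


# "I like red flowers": the modifier 红 (red) precedes the noun 花 (flower).
sentence = [("我", "PRON"), ("喜欢", "VERB"), ("红", "ADJ"), ("花", "NOUN")]
tagged_source = tag_modifier_pairs(sentence)
reordered_source = reorder_modifier_pairs(sentence)
```

Either output could in principle be paired with the original Vietnamese target to form an additional training example; whether this matches the strategies evaluated in the article cannot be determined from the abstract alone.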

Keywords:
Vietnamese; Machine translation; Computer science; Natural language processing; Artificial intelligence; Boosting (machine learning); Translation (biology); Linguistics; Domain (mathematical analysis); Mathematics

Metrics

Cited By: 5
FWCI (Field Weighted Citation Impact): 0.98
Refs: 19
Citation Normalized Percentile: 0.73

Topics

Natural Language Processing Techniques
Physical Sciences →  Computer Science →  Artificial Intelligence
Topic Modeling
Physical Sciences →  Computer Science →  Artificial Intelligence
Handwritten Text Recognition Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition