JOURNAL ARTICLE

Data augmentation using back-translation for context-aware neural machine translation

Abstract

A single sentence does not always convey enough information to be translated into another language. Some target languages require the translator to add or specialize words that are omitted or ambiguous in the source language (e.g., zero pronouns when translating Japanese to English, or epicene pronouns when translating English to French). Translating such ambiguous sentences requires context beyond a single sentence, which has motivated work on context-aware neural machine translation (NMT). However, the large parallel corpora needed to train accurate context-aware NMT models are not easily available. In this study, we first obtain large-scale pseudo parallel corpora by back-translating monolingual data, and then investigate their impact on the translation accuracy of context-aware NMT models. We evaluated context-aware NMT models trained with small parallel corpora and the large-scale pseudo parallel corpora on English-Japanese and English-French datasets to demonstrate the large impact of this data augmentation on context-aware NMT models.
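The data augmentation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how much context is used or how it is fed to the model, so the single preceding sentence used here is an assumption, as are the names `ContextExample`, `back_translate_documents`, and `toy_reverse_translate`. The reverse (target-to-source) translation model is represented by a placeholder function so that the example runs without any trained model.

```python
from typing import Callable, Iterable, List, NamedTuple, Sequence


class ContextExample(NamedTuple):
    """One pseudo parallel training example with sentence-level context."""
    context: str  # preceding source-side sentence ("" at the start of a document)
    source: str   # synthetic source sentence produced by back-translation
    target: str   # genuine target-language sentence from the monolingual data


def back_translate_documents(
    target_docs: Iterable[Sequence[str]],
    reverse_translate: Callable[[str], str],
) -> List[ContextExample]:
    """Build pseudo parallel examples from monolingual target-side documents.

    Assumption: the context is the one preceding (back-translated) source
    sentence from the same document; the abstract does not specify this.
    """
    examples: List[ContextExample] = []
    for doc in target_docs:
        prev_source = ""  # context never crosses a document boundary
        for target_sentence in doc:
            source_sentence = reverse_translate(target_sentence)
            examples.append(
                ContextExample(prev_source, source_sentence, target_sentence)
            )
            prev_source = source_sentence
    return examples


def toy_reverse_translate(sentence: str) -> str:
    """Hypothetical stand-in for a trained target-to-source NMT model."""
    return "<bt> " + sentence.lower()


docs = [["Hello there.", "She left."], ["Good morning."]]
examples = back_translate_documents(docs, toy_reverse_translate)
for example in examples:
    print(example)
```

In practice the placeholder would be replaced by a sentence-level model trained on the small genuine parallel corpus, and the resulting pseudo examples would be combined with the genuine ones for training.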

Keywords:
Computer science; Machine translation; Parallel corpora; Natural language processing; Artificial intelligence; Sentence; Context (archaeology); Translation (biology); Scale (ratio)

Metrics

Cited By: 80
FWCI (Field Weighted Citation Impact): 3.69
Refs: 29
Citation Normalized Percentile: 0.94

Topics

Natural Language Processing Techniques
Physical Sciences → Computer Science → Artificial Intelligence
Topic Modeling
Physical Sciences → Computer Science → Artificial Intelligence
Multimodal Machine Learning Applications
Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Related Documents

JOURNAL ARTICLE

Syntax-Aware Data Augmentation for Neural Machine Translation

Sufeng Duan, Hai Zhao, Dongdong Zhang

Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing; Year: 2023; Vol: 31; Pages: 2988-2999

JOURNAL ARTICLE

Context-Aware Neural Machine Translation using Selected Context

Sami Ul Haq, Sadaf Abdul Rauf, Arslan Shaukat, Muhammad Hassan Arif

Journal: 2022 19th International Bhurban Conference on Applied Sciences and Technology (IBCAST); Year: 2022; Pages: 349-352

JOURNAL ARTICLE

Context-aware neural machine translation

Herold, Christian

Journal: RWTH Publications (RWTH Aachen); Year: 2024