L. Zhu, Shu Jiang, Hai Zhao, Zuchao Li, Jiashuang Huang, Weiping Ding, Bao-Liang Lu
Standard neural machine translation (NMT) assumes that document-level context information can be ignored. Most existing document-level NMT methods settle for shallow document-level information, such as a few context sentences surrounding the source sentence. Our work exploits detailed document-level context in the form of multiple document embeddings, which can model deeper and richer document-level context. The proposed document-level NMT enhances the Transformer baseline by introducing both global and local document-level clues on the source end: we compress the entire document text, with explicit boundaries, into a token-sized global static document embedding, compress the neighboring sentences into a token-sized local dynamic document embedding, and concatenate both with the source tokens. Experiments reveal that the proposed method significantly improves translation performance over strong baselines and other related studies.
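As a concrete illustration of the scheme the abstract describes, the following minimal PyTorch sketch prepends a global and a local document "token" to the source embeddings before a standard Transformer encoder. It is not the authors' implementation: the class name DocEmbedEncoder, the use of mean pooling to compress the document and the neighboring sentences into token-sized vectors, and the omission of positional encodings are all assumptions made for brevity.

    import torch
    import torch.nn as nn

    class DocEmbedEncoder(nn.Module):
        """Hypothetical sketch: concatenate a token-sized global static and a
        token-sized local dynamic document embedding with the source tokens."""

        def __init__(self, vocab_size, d_model=512, nhead=8, num_layers=6):
            super().__init__()
            self.tok_emb = nn.Embedding(vocab_size, d_model)
            layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
            self.encoder = nn.TransformerEncoder(layer, num_layers)

        def forward(self, src_ids, doc_ids, local_ids):
            # src_ids:   (B, S) source-sentence tokens
            # doc_ids:   (B, D) tokens of the whole document, with boundaries
            # local_ids: (B, N) tokens of the neighboring sentences
            src = self.tok_emb(src_ids)                            # (B, S, d)
            # Compress to single token-sized vectors; mean pooling is an
            # assumption, as the abstract does not specify the compression.
            g = self.tok_emb(doc_ids).mean(dim=1, keepdim=True)    # global static, (B, 1, d)
            l = self.tok_emb(local_ids).mean(dim=1, keepdim=True)  # local dynamic, (B, 1, d)
            # Concatenate the two document tokens with the source tokens.
            x = torch.cat([g, l, src], dim=1)                      # (B, S+2, d)
            return self.encoder(x)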
Zewei Sun, Mingxuan Wang, Hao Zhou, Chengqi Zhao, Shujian Huang, Jiajun Chen, Lei Li
Zhang Li, Zhirui Zhang, Boxing Chen, Weihua Luo, Luo Si
Sachith Sri Ram Kothur, Rebecca Knowles, Philipp Koehn