JOURNAL ARTICLE

Lexical Chains meet Word Embeddings in Document-level Statistical Machine Translation

Abstract

Currently under review for EMNLP 2017.

The phrase-based Statistical Machine Translation (SMT) approach deals with sentences in isolation, making it difficult to take discourse context into account during translation. This poses a challenge for ambiguous words that require discourse knowledge to be translated correctly. We propose a method that exploits the semantic similarity captured by lexical chains to improve SMT output, integrating it into a document-level decoder. Unlike the traditional approach, which relies on lexical resources, we build the lexical chains from word embeddings. Experimental results on German-to-English translation show that our method produces correct translations in up to 88% of the changes it makes, improving over the baseline in 36%-48% of them.
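The core idea of the abstract, linking related words across a document into lexical chains by embedding similarity rather than a lexical resource, can be sketched as a greedy chaining procedure. This is a minimal illustration, not the paper's implementation: the toy 2-d vectors, the 0.6 similarity threshold, and the last-member comparison strategy are all assumptions made for the example.

```python
import math

def cosine(u, v):
    # Standard cosine similarity between two dense vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def build_chains(words, vectors, threshold=0.6):
    """Greedy lexical chaining: attach each word to the first chain
    whose most recent member is similar enough; otherwise start a
    new chain. Threshold and strategy are illustrative assumptions."""
    chains = []
    for w in words:
        for chain in chains:
            if cosine(vectors[chain[-1]], vectors[w]) >= threshold:
                chain.append(w)
                break
        else:
            chains.append([w])
    return chains

# Toy 2-d "embeddings", assumed purely for illustration.
vecs = {
    "bank":  [0.90, 0.10],
    "money": [0.85, 0.20],
    "river": [0.10, 0.95],
    "water": [0.15, 0.90],
}
print(build_chains(["bank", "money", "river", "water"], vecs))
# → [['bank', 'money'], ['river', 'water']]
```

In a document-level decoder, chains like these would supply a discourse-wide preference for translating an ambiguous word consistently with its chain-mates.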

Keywords:
Computer science, Natural language processing, Machine translation, Statistical machine translation, Artificial intelligence, Word embeddings, Lexical chains, Discourse context, Evaluation of machine translation, Example-based machine translation, Semantic similarity, German, Transfer-based machine translation, Machine translation software usability, Rule-based machine translation, Linguistics

Metrics

Cited by: 14
FWCI (Field-Weighted Citation Impact): 2.52
References: 49
Citation Normalized Percentile: 0.91

Topics

Natural Language Processing Techniques
Physical Sciences →  Computer Science →  Artificial Intelligence
Topic Modeling
Physical Sciences →  Computer Science →  Artificial Intelligence
Text Readability and Simplification
Physical Sciences →  Computer Science →  Artificial Intelligence