Turghun Tayir, Lin Li, Bei Li, Jianquan Liu, Kong Aik Lee
The main purpose of multimodal machine translation is to improve translation quality by taking the corresponding visual context as an additional input. Recently, many studies in neural machine translation have attempted to obtain high-quality multimodal representations in the encoder or decoder via an attention mechanism. However, the attention mechanism does not always accurately identify the decisive input for each prediction, which leads to unsatisfactory multimodal information fusion. To this end, we propose an encoder-decoder calibration method that automatically calibrates the fused image-text representation in the encoder and identifies the decisive input for each prediction in the decoder. We validate our model on the multimodal machine translation dataset Multi30K. Experimental results show that our method significantly outperforms several recent baselines on both the English–German and English–French translation tasks in terms of BLEU and METEOR.
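The attention-based fusion of visual context into the text representation that the abstract describes can be sketched roughly as follows. This is an illustrative minimal sketch, not the paper's method: the function name, feature shapes, and the simple additive fusion are our own assumptions; text states attend over image-region features via scaled dot-product attention, and the attended visual context is merged back into the text representation.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_fusion(text, image):
    """Illustrative multimodal fusion (assumed, not the paper's exact model):
    each text state attends over image regions, and the attended visual
    context is added to the text state to form a multimodal representation."""
    d = text.shape[-1]
    scores = text @ image.T / np.sqrt(d)   # (n_tokens, n_regions) similarity
    weights = softmax(scores, axis=-1)     # attention over image regions
    visual_context = weights @ image       # (n_tokens, d) attended context
    return text + visual_context           # fused multimodal representation

rng = np.random.default_rng(0)
text = rng.normal(size=(5, 16))    # 5 source tokens, feature dim 16
image = rng.normal(size=(49, 16))  # 49 image regions, feature dim 16
fused = attention_fusion(text, image)
print(fused.shape)  # (5, 16)
```

The calibration step proposed in the paper would operate on top of such a fused representation, adjusting how much each modality contributes to each prediction; that mechanism is not reproduced here.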
Yingbo Gao, Christian Herold, Zijian Yang, Hermann Ney
Peng-Jen Chen, Bowen Shi, Kelvin Niu, Ann Lee, Wei-Ning Hsu
Jungo Kasai, Nikolaos Pappas, Hao Peng, James Cross, Noah A. Smith