CCIM: Cross-modal Cross-lingual Interactive Image Translation

Cong Ma; Yaping Zhang; Mei Tu; Yang Zhao; Yu Zhou; Chengqing Zong

doi:10.18653/v1/2023.findings-emnlp.330

JOURNAL ARTICLE

Year: 2023 Pages: 4959-4965

Abstract

Text image machine translation (TIMT), which translates source-language text images into target-language text, has attracted intensive attention in recent years. Although an end-to-end TIMT model generates the target translation directly from encoded text image features with an efficient architecture, it lacks the recognized source language information, which decreases translation performance. In this paper, we propose a novel Cross-modal Cross-lingual Interactive Model (CCIM) that incorporates source language information by synchronously generating source language and target language results through an interactive attention mechanism between two language decoders. Extensive experiments show that the interactive decoder significantly outperforms end-to-end TIMT models and offers faster decoding and a smaller model size than cascade models.
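The interactive attention between the two language decoders can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the function names (`attend`, `interactive_step`), the additive fusion, the fusion weight `lam`, and the toy two-dimensional states are all hypothetical. The sketch only shows the general idea that, at each step, a decoder attends over its own history and over the other decoder's history, and that both decoders advance in the same step.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Scaled dot-product attention of one query vector over key/value vectors."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]


def interactive_step(own_state, own_history, other_history, lam=0.5):
    """One decoder step: self-attention context plus a weighted context
    taken from the other decoder's history (hypothetical additive fusion)."""
    self_ctx = attend(own_state, own_history, own_history)
    cross_ctx = attend(own_state, other_history, other_history)
    return [s + lam * c for s, c in zip(self_ctx, cross_ctx)]


# Toy hidden states for the source-language and target-language decoders.
src_history = [[1.0, 0.0], [0.0, 1.0]]
tgt_history = [[0.5, 0.5], [1.0, 1.0]]

# Synchronous update: both decoders advance in the same step,
# each one reading the other's previously generated states.
src_out = interactive_step(src_history[-1], src_history, tgt_history)
tgt_out = interactive_step(tgt_history[-1], tgt_history, src_history)
```

Setting `lam=0` in this sketch disables the cross-decoder term and reduces each decoder to plain self-attention, which gives a simple way to compare behavior with and without the interaction.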

Keywords: Computer science; Machine translation; Translation (biology); Modal; Decoding methods; Artificial intelligence; Image (mathematics); Natural language processing; Cascade; Speech recognition; Language model; Algorithm

Metrics

FWCI (Field Weighted Citation Impact): 0.77

Citation Normalized Percentile: 0.74

Topics

Natural Language Processing Techniques

Physical Sciences → Computer Science → Artificial Intelligence

Multimodal Machine Learning Applications

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Handwritten Text Recognition Techniques

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

