JOURNAL ARTICLE

MMSFT: Multilingual Multimodal Summarization by Fine-Tuning Transformers

Siginamsetty Phani, Ashu Abdul, M. Krishna Siva Prasad, Hiren Kumar Deva Sarma

Year: 2024   Journal: IEEE Access   Vol: 12   Pages: 129673-129689   Publisher: Institute of Electrical and Electronics Engineers

Abstract

Multilingual multimodal (MM) summarization, in which a single model processes multimodal input (MI) data across multiple languages to generate corresponding multimodal summaries (MS), remains underexplored. MI data consists of text and associated images, while an MS combines text with relevant images aligned with the MI context. In this paper, we propose an MM summarization model by fine-tuning transformers (MMSFT), focusing on low-resource languages (LRLs) such as the Indian languages. MMSFT comprises multilingual learning for encoder training, incorporating multilingual attention with a forget gate mechanism, followed by MS generation using a decoder. In the proposed approach, we use the publicly available multilingual multimodal summarization dataset (M3LS). Evaluation using ROUGE metrics and the language-agnostic target summary metric (LaTM) shows that MMSFT significantly improves on existing MM summarization models such as mT5 and VG-mT5. Furthermore, MMSFT yields summaries that are better than or equivalent to those of existing MM summarization models trained separately for each language. Human and statistical evaluations reveal MMSFT’s significant improvement over existing models, with a p-value $\leq 0.05$ in paired t-tests.
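
The ROUGE evaluation mentioned in the abstract can be illustrated with a minimal sketch of ROUGE-N, which scores a candidate summary by its clipped n-gram overlap with a reference summary. This is not the authors' evaluation code. The function names `ngrams` and `rouge_n`, the lowercased whitespace tokenization, and the toy sentences are all assumptions made for illustration, and a real evaluation on Indian languages would need language-appropriate tokenization.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Compute ROUGE-N precision, recall, and F1 for two strings.

    Assumption: simple lowercased whitespace tokenization (hypothetical
    choice for this sketch, not taken from the paper).
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    # Counter intersection keeps the minimum count per n-gram (clipping).
    overlap = sum((cand & ref).values())
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


# Toy sentences invented for illustration only (not drawn from M3LS).
unigram_scores = rouge_n("the cat sat on the mat", "the cat lay on the mat", n=1)
bigram_scores = rouge_n("the cat sat on the mat", "the cat lay on the mat", n=2)
print(unigram_scores)
print(bigram_scores)
```

In the toy pair, five of the six unigrams and three of the five bigrams match, so ROUGE-2 is lower than ROUGE-1. Longer n-grams penalize word-order and wording changes more heavily, which is why the two scores are usually reported together.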

Keywords:
Automatic summarization, Computer science, Transformer, Natural language processing, Artificial intelligence, Electrical engineering, Engineering, Voltage

Metrics

Cited By: 5
FWCI (Field Weighted Citation Impact): 3.19
Refs: 70
Citation Normalized Percentile: 0.89

Topics

Natural Language Processing Techniques
Physical Sciences → Computer Science → Artificial Intelligence
Topic Modeling
Physical Sciences → Computer Science → Artificial Intelligence
Speech and Dialogue Systems
Physical Sciences → Computer Science → Artificial Intelligence