JOURNAL ARTICLE

Deep Multi-Modal Metric Learning with Multi-Scale Correlation for Image-Text Retrieval

Hua Yan, Yingyun Yang, Jianhe Du

Year: 2020   Journal: Electronics   Vol: 9 (3)   Pages: 466-466   Publisher: Multidisciplinary Digital Publishing Institute

Abstract

Multi-modal retrieval is challenging because of the heterogeneity gap and the complex semantic relationships between data of different modalities. Typical methods map different modalities into a common subspace using a one-to-one correspondence or a similarity/dissimilarity relationship between inter-modal data, so that the distances between heterogeneous data can be compared directly; inter-modal retrieval can then be achieved by nearest-neighbor search. However, most of these methods ignore intra-modal relations and the complicated semantics between multi-modal data. In this paper, we propose a deep multi-modal metric learning method with multi-scale semantic correlation for retrieval tasks between the image and text modalities. A deep model with two branches is designed to nonlinearly map raw heterogeneous data into comparable representations. In contrast to binary similarity, we formulate the semantic relationship with multi-scale similarity to learn fine-grained multi-modal distances. Inter-modal and intra-modal correlations constructed on multi-scale semantic similarity are incorporated to train the deep model in an end-to-end way. Experiments validate the effectiveness of the proposed method on multi-modal retrieval tasks, and our method outperforms state-of-the-art methods on the NUS-WIDE, MIR Flickr, and Wikipedia datasets.
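
The abstract does not give the exact formulation, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes (hypothetically) that multi-scale similarity is the Jaccard overlap of two items' label sets, and that each pair is pulled toward a target distance that shrinks as similarity grows. The names `multi_scale_similarity`, `graded_pair_loss`, `total_loss`, `margin`, and `intra_weight` are invented for this example, and plain lists stand in for the outputs of the two network branches.

```python
import math


def multi_scale_similarity(labels_a, labels_b):
    """Graded similarity in [0, 1] (assumption: Jaccard overlap of label sets)."""
    a, b = set(labels_a), set(labels_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def euclidean(u, v):
    """Euclidean distance between two embedding vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def graded_pair_loss(u, v, sim, margin=1.0):
    """Squared gap between the actual distance and a similarity-dependent target.

    Assumption: the target distance is margin * (1 - sim), so fully similar
    pairs are pulled together and fully dissimilar pairs sit at the margin.
    """
    target = margin * (1.0 - sim)
    return (euclidean(u, v) - target) ** 2


def total_loss(img_emb, txt_emb, labels, intra_weight=0.5):
    """Sum inter-modal (image-text) and intra-modal (image-image, text-text) terms."""
    n = len(labels)
    inter = 0.0
    intra = 0.0
    for i in range(n):
        for j in range(n):
            s = multi_scale_similarity(labels[i], labels[j])
            # Inter-modal term: image i against text j.
            inter += graded_pair_loss(img_emb[i], txt_emb[j], s)
            if i < j:
                # Intra-modal terms: each unordered pair counted once per modality.
                intra += graded_pair_loss(img_emb[i], img_emb[j], s)
                intra += graded_pair_loss(txt_emb[i], txt_emb[j], s)
    return inter + intra_weight * intra
```

With binary similarity, two items sharing one of two labels would be treated the same as items sharing all labels; under the graded target above they are instead assigned an intermediate distance, which is the kind of fine-grained distinction the abstract describes.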

Keywords:
Modal; Computer science; Similarity (geometry); Metric (unit); Artificial intelligence; Semantics (computer science); Scale (ratio); Pattern recognition (psychology); Subspace topology; Image retrieval; Modality (human–computer interaction); Information retrieval; Data mining; Image (mathematics); Geography

Metrics

Cited By: 4
FWCI (Field Weighted Citation Impact): 0.31
Refs: 49
Citation Normalized Percentile: 0.54
Is in top 1%
Is in top 10%

Topics

Advanced Image and Video Retrieval Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Multimodal Machine Learning Applications
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Image Retrieval and Classification Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition

Related Documents

JOURNAL ARTICLE

Online Multi-Modal Distance Metric Learning with Application to Image Retrieval

Pengcheng Wu, Steven C. H. Hoi, Peilin Zhao, Chunyan Miao, Zhiyong Liu

Journal: IEEE Transactions on Knowledge and Data Engineering   Year: 2015   Vol: 28 (2)   Pages: 454-467

JOURNAL ARTICLE

Deep Metric Learning for Multi-Label and Multi-Object Image Retrieval

Jonathan Mojoo, Takio Kurita

Journal: IEICE Transactions on Information and Systems   Year: 2021   Vol: E104.D (6)   Pages: 873-880

JOURNAL ARTICLE

Multi-Modal Multi-Scale Deep Learning for Large-Scale Image Annotation

Yulei Niu, Zhiwu Lu, Ji-Rong Wen, Tao Xiang, Shih-Fu Chang

Journal: IEEE Transactions on Image Processing   Year: 2018   Vol: 28 (4)   Pages: 1720-1731

JOURNAL ARTICLE

Multi-modal deep distance metric learning

Seyed Mahdi Roostaiyan, Ehsan Imani, Mahdieh Soleymani Baghshah

Journal: Intelligent Data Analysis   Year: 2017   Vol: 21 (6)   Pages: 1351-1369