Guanglong Xu, Zhensheng Hu, Jia Cai
Zero-shot sketch-based image retrieval (ZSSBIR) aims to retrieve natural images given free hand-drawn sketches from categories that may not appear during training. Previous approaches relied on semantically aligned sketch-image pairs or employed memory-expensive fusion layers to project visual information into a low-dimensional subspace, ignoring the significant heterogeneous cross-domain discrepancy between highly abstract sketches and the corresponding images. This can yield poor performance during the training phase. To tackle this issue, we propose a Wasserstein-distance-based cross-modal semantic network (WAD-CMSN) for ZSSBIR. Specifically, it first projects the visual information of each branch (sketch, image) into a common low-dimensional semantic subspace via the Wasserstein distance in an adversarial training manner. Furthermore, a novel identity matching loss is employed to select useful features, which not only captures complete semantic knowledge but also alleviates over-fitting of the WAD-CMSN model. Experimental results on the challenging Sketchy (Extended) and TU-Berlin (Extended) datasets demonstrate the effectiveness of the proposed WAD-CMSN model over several competitors.
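To give a concrete feel for the alignment objective, the sketch below computes an empirical 1-D Wasserstein-1 distance between two feature samples. This is a simplified illustration, not the paper's method: the actual model uses a learned critic in adversarial training over high-dimensional features, and the feature arrays here (`sketch_feats`, `image_feats`) are hypothetical stand-ins. For equal-size 1-D samples, the optimal transport plan pairs sorted values, so the distance reduces to the mean absolute difference of the sorted arrays.

```python
import numpy as np

def wasserstein_1d(a, b):
    """Empirical 1-D Wasserstein-1 distance between two equal-size samples.

    For 1-D samples of equal size, optimal transport matches sorted values,
    so W1 is simply the mean absolute difference after sorting.
    """
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    if a.shape != b.shape:
        raise ValueError("samples must have the same size")
    return float(np.mean(np.abs(a - b)))

# Hypothetical 1-D "semantic features" for the sketch and image branches.
rng = np.random.default_rng(0)
sketch_feats = rng.normal(loc=0.0, scale=1.0, size=512)
image_feats = rng.normal(loc=0.5, scale=1.0, size=512)

d = wasserstein_1d(sketch_feats, image_feats)
# Driving d toward zero (adversarially, in the full model) pulls the two
# feature distributions toward a shared semantic subspace.
```

Minimizing this distance between the sketch-branch and image-branch feature distributions is what encourages the two domains to share one semantic subspace.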