JOURNAL ARTICLE

WAD-CMSN: Wasserstein distance-based cross-modal semantic network for zero-shot sketch-based image retrieval

Guanglong Xu, Zhensheng Hu, Jia Cai

Year: 2022
Journal: International Journal of Wavelets, Multiresolution and Information Processing
Vol: 21 (02)
Publisher: World Scientific

Abstract

Zero-shot sketch-based image retrieval (ZSSBIR) aims to retrieve natural images given free-hand sketches from categories that may not appear during training. Previous approaches relied on semantically aligned sketch-image pairs or on a memory-expensive fusion layer to project visual information into a low-dimensional subspace, ignoring the significant heterogeneous cross-domain discrepancy between highly abstract sketches and their relevant images. This may yield poor performance in the training phase. To address this drawback, we propose a Wasserstein distance-based cross-modal semantic network (WAD-CMSN) for ZSSBIR. Specifically, it first projects the visual information of each branch (sketch, image) into a common low-dimensional semantic subspace via the Wasserstein distance in an adversarial training manner. Furthermore, a novel identity matching loss is employed to select useful features, which not only captures complete semantic knowledge but also alleviates the over-fitting of the WAD-CMSN model. Experimental results on the challenging Sketchy (Extended) and TU-Berlin (Extended) datasets indicate the effectiveness of the proposed WAD-CMSN model over several competitors.
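To make the central quantity concrete, the sketch below computes the empirical Wasserstein-1 distance between two equal-size one-dimensional samples, which in one dimension reduces to the mean absolute difference of the sorted values. It is only an illustration of the distance itself, not the WAD-CMSN training procedure: the abstract does not specify the critic network or the loss weights, and the function name `wasserstein_1d` and the toy feature values are assumptions introduced here.

```python
def wasserstein_1d(xs, ys):
    """Empirical Wasserstein-1 distance between two equal-size 1-D samples.

    Illustrative helper (hypothetical name); not taken from the paper.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("samples must be non-empty and of equal size")
    # In one dimension the optimal transport plan pairs the sorted values,
    # so the distance is the mean absolute gap between matched points.
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


# Toy stand-ins for 1-D projections of sketch and image features (assumed data).
sketch_feats = [0.0, 1.0, 2.0]
image_feats = [3.0, 1.0, 2.0]

# Sorted pairs are (0, 1), (1, 2), (2, 3), so every gap is 1.
distance = wasserstein_1d(sketch_feats, image_feats)

# Shifting a sample by a constant moves it exactly that far.
shifted = wasserstein_1d(sketch_feats, [x + 0.5 for x in sketch_feats])
```

In an adversarial setting such as the one the abstract describes, this distance is typically estimated by a learned critic over high-dimensional features rather than by sorting; the closed form above holds only for the one-dimensional case.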

Keywords:
Sketch; Computer science; Artificial intelligence; Subspace topology; Image (mathematics); Matching (statistics); Modal; Semantic matching; Pattern recognition (psychology); Algorithm; Mathematics

Metrics

Cited By: 5
FWCI (Field Weighted Citation Impact): 0.62
Refs: 21
Citation Normalized Percentile: 0.64

Topics

Advanced Image and Video Retrieval Techniques
Physical Sciences → Computer Science → Computer Vision and Pattern Recognition
Multimodal Machine Learning Applications
Physical Sciences → Computer Science → Computer Vision and Pattern Recognition
Domain Adaptation and Few-Shot Learning
Physical Sciences → Computer Science → Artificial Intelligence