JOURNAL ARTICLE

Deep Hashing Similarity Learning for Cross-Modal Retrieval

Ying Ma, Meng Wang, Guangyun Lu, Yajun Sun

Year: 2024; Journal: IEEE Access; Vol: 12; Pages: 8609-8618; Publisher: Institute of Electrical and Electronics Engineers

Abstract

In cross-modal retrieval research, hashing methods have attracted significant attention because of their high retrieval efficiency and low storage cost. However, these methods often lose a considerable amount of semantic information when mapping multi-modal features to a low-dimensional space. In addition, hash learning has focused primarily on inter-modal similarity, neglecting intra-modal similarity. To address these issues, this paper proposes a novel cross-modal hashing method called Deep Hashing Similarity Learning for Cross-modal Retrieval (DHSL). DHSL incorporates relation networks into the hashing method, enabling pairwise matching between images and texts. This approach bridges the heterogeneity gap between images and texts while also emphasizing the intra-modal similarity information within each modality. The result is a hash similarity matrix that captures both inter-modal similarity and intra-modal discriminability. Because transforming high-dimensional features into hash codes often loses important semantic information, we introduce a feature selector to enhance the features. The selector picks out distinctive features from the original feature set and combines them with the low-dimensional features to complement the semantic information. Finally, we introduce a weighted cosine triplet loss and a quantization loss to constrain the hash representation in the Hamming space, thereby learning high-quality hash codes. Comprehensive experimental results on two benchmark datasets, NUS-WIDE and MIRFlickr25K, demonstrate that DHSL outperforms state-of-the-art cross-modal hashing methods.
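
The abstract names a weighted cosine triplet loss and a quantization loss but does not give their formulas. The sketch below shows one common way such losses are written, purely as an illustration: the function names, the margin value, the per-triplet `weights` argument, and the use of the sign function for binarization are all assumptions, not the DHSL definitions.

```python
import numpy as np

def cosine_sim(a, b, eps=1e-12):
    """Row-wise cosine similarity between two (n, k) arrays."""
    num = np.sum(a * b, axis=1)
    den = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1) + eps
    return num / den

def weighted_cosine_triplet_loss(anchor, positive, negative, weights, margin=0.5):
    """Mean of w * max(0, margin - cos(a, p) + cos(a, n)).

    `weights` and `margin` are hypothetical; the paper's weighting
    scheme is not described in the abstract.
    """
    hinge = np.maximum(
        0.0,
        margin - cosine_sim(anchor, positive) + cosine_sim(anchor, negative),
    )
    return float(np.mean(weights * hinge))

def quantization_loss(h):
    """Mean squared gap between continuous codes and their +/-1 binarization."""
    b = np.where(h >= 0, 1.0, -1.0)
    return float(np.mean((h - b) ** 2))

def hamming_distance(b1, b2):
    """Number of positions where two binary codes differ."""
    return int(np.sum(b1 != b2))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy continuous 16-bit codes for 4 triplets (e.g. image anchor, text pos/neg).
    a, p, n = (rng.standard_normal((4, 16)) for _ in range(3))
    w = np.ones(4)  # uniform weights, an assumption
    total = weighted_cosine_triplet_loss(a, p, n, w) + quantization_loss(a)
    print("combined loss:", total)
    print("Hamming distance:", hamming_distance(np.sign(a[0]), np.sign(p[0])))
```

In this formulation the triplet term pulls matched pairs together and pushes mismatched pairs apart in cosine space, while the quantization term penalizes continuous codes that sit far from their binarized values, so that little is lost when the codes are finally thresholded into the Hamming space.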

Keywords:
Computer science; Hash function; Hamming space; Artificial intelligence; Double hashing; Dynamic perfect hashing; Feature hashing; Pattern recognition (psychology); Feature learning; Similarity (geometry); Hash table; Algorithm; Hamming code; Image (mathematics)

Metrics

Cited By: 1
FWCI (Field Weighted Citation Impact): 0.53
Refs: 34
Citation Normalized Percentile: 0.48

Topics

Advanced Image and Video Retrieval Techniques
Physical Sciences → Computer Science → Computer Vision and Pattern Recognition
Multimodal Machine Learning Applications
Physical Sciences → Computer Science → Computer Vision and Pattern Recognition
Domain Adaptation and Few-Shot Learning
Physical Sciences → Computer Science → Artificial Intelligence