Abstract

Recently, due to the unprecedented growth of multimedia data, cross-modal hashing has gained increasing attention for efficient cross-media retrieval. Typically, existing cross-modal hashing methods treat the labels of an instance independently and overlook the correlations among labels. Indeed, in many real-world scenarios, such as the online fashion domain, instances (items) are labeled with a set of categories that are related through a hierarchy. In this paper, we propose a new end-to-end solution for supervised cross-modal hashing, named HiCHNet, which explicitly exploits the hierarchical labels of instances. In particular, based on the pre-established label hierarchy, we comprehensively characterize each modality of an instance with a set of layer-wise hash representations. In essence, hash codes are encouraged not only to preserve the layer-wise semantic similarities encoded by the label hierarchy, but also to retain the hierarchical discriminative capabilities. Due to the lack of benchmark datasets, apart from adapting the existing dataset FashionVC from the fashion domain, we create a dataset from the online fashion platform Ssense consisting of 15,696 image-text pairs labeled by 32 hierarchical categories. Extensive experiments on the two real-world datasets demonstrate the superiority of our model over state-of-the-art methods.
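To make the idea of layer-wise similarity concrete, the following is a minimal sketch and not the HiCHNet implementation. It assumes a hypothetical two-layer hierarchy (coarse category, fine category) and hand-written toy hash codes in {-1, +1}; the item names, labels, code lengths, and helper functions are all invented for illustration. It shows how a label hierarchy yields one similarity value per layer, and how a set of layer-wise codes from two modalities can be compared layer by layer with the Hamming distance.

```python
from typing import Dict, List, Tuple

# Hypothetical label paths: (coarse category, fine category) for each item.
LABEL_PATHS: Dict[str, Tuple[str, str]] = {
    "item_a": ("clothing", "dress"),
    "item_b": ("clothing", "jacket"),
    "item_c": ("shoes", "boots"),
}


def layerwise_similarity(x: str, y: str) -> List[int]:
    """Return one 0/1 similarity per hierarchy layer (1 if labels match there)."""
    return [int(a == b) for a, b in zip(LABEL_PATHS[x], LABEL_PATHS[y])]


def hamming(u: List[int], v: List[int]) -> int:
    """Count the positions at which two equal-length codes differ."""
    return sum(a != b for a, b in zip(u, v))


def layerwise_hamming(codes_x: List[List[int]], codes_y: List[List[int]]) -> List[int]:
    """Compare two sets of layer-wise codes, one Hamming distance per layer."""
    return [hamming(u, v) for u, v in zip(codes_x, codes_y)]


# Toy layer-wise codes (assumed, not learned): one 4-bit code per layer.
image_codes_a = [[1, 1, -1, -1], [1, -1, 1, -1]]   # image of item_a
text_codes_b = [[1, 1, -1, -1], [-1, 1, -1, 1]]    # text of item_b

print(layerwise_similarity("item_a", "item_b"))        # [1, 0]
print(layerwise_hamming(image_codes_a, text_codes_b))  # [0, 4]
```

In this toy case the two items share a coarse category but not a fine one, so a similarity-preserving set of codes would be close at the first layer and far apart at the second, which is the pattern the hand-written codes exhibit.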

Keywords:
Computer science; Hash function; Discriminative model; Benchmark (surveying); Hierarchy; Artificial intelligence; Feature hashing; Set (abstract data type); Domain (mathematical analysis); Modal; Dynamic perfect hashing; Universal hashing; Machine learning; Information retrieval; Hash table; Pattern recognition (psychology); Data mining; Double hashing; Mathematics

Metrics

Cited By: 52
FWCI (Field Weighted Citation Impact): 3.85
Refs: 47
Citation Normalized Percentile: 0.95

Topics

Advanced Image and Video Retrieval Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Video Surveillance and Tracking Methods
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Video Analysis and Summarization
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
