JOURNAL ARTICLE

Semi-Supervised Hashing for Large-Scale Search

Jun Wang, Sanjiv Kumar, Shih-Fu Chang

Year: 2012 Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence Vol: 34 (12) Pages: 2393-2406 Publisher: IEEE Computer Society

Abstract

Hashing-based approximate nearest neighbor (ANN) search in huge databases has become popular due to its computational and memory efficiency. The popular hashing methods, e.g., Locality Sensitive Hashing and Spectral Hashing, construct hash functions based on random or principal projections. The resulting hashes are either not very accurate or are inefficient. Moreover, these methods are designed for a given metric similarity. On the contrary, semantic similarity is usually given in terms of pairwise labels of samples. There exist supervised hashing methods that can handle such semantic similarity, but they are prone to overfitting when labeled data are small or noisy. In this work, we propose a semi-supervised hashing (SSH) framework that minimizes empirical error over the labeled set and an information theoretic regularizer over both labeled and unlabeled sets. Based on this framework, we present three different semi-supervised hashing methods, including orthogonal hashing, nonorthogonal hashing, and sequential hashing. Particularly, the sequential hashing method generates robust codes in which each hash function is designed to correct the errors made by the previous ones. We further show that the sequential learning paradigm can be extended to unsupervised domains where no labeled pairs are available. Extensive experiments on four large datasets (up to 80 million samples) demonstrate the superior performance of the proposed SSH methods over state-of-the-art supervised and unsupervised hashing techniques.
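As a rough illustration of the projection-based baseline the abstract mentions for Locality Sensitive Hashing, the sketch below turns each vector into a binary code by taking the sign of random projections, then ranks database items by Hamming distance to the query. The function names (`make_projections`, `hash_code`, `hamming`, `search`), the Gaussian projections, and the toy data are illustrative assumptions, not the authors' code, and the sketch does not implement the proposed SSH methods.

```python
import random


def make_projections(dim, n_bits, seed=0):
    """Draw n_bits random Gaussian projection vectors (assumed LSH-style baseline)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def hash_code(x, projections):
    """One bit per projection: 1 if w . x >= 0, else 0, packed into an int."""
    code = 0
    for w in projections:
        dot = sum(wi * xi for wi, xi in zip(w, x))
        code = (code << 1) | (1 if dot >= 0 else 0)
    return code


def hamming(a, b):
    """Number of differing bits between two integer codes."""
    return bin(a ^ b).count("1")


def search(query, database, projections, k=1):
    """Return indices of the k database items closest to the query in Hamming space."""
    q = hash_code(query, projections)
    codes = [hash_code(x, projections) for x in database]
    return sorted(range(len(database)), key=lambda i: hamming(q, codes[i]))[:k]


# Toy usage: two opposite vectors receive complementary codes.
projections = make_projections(dim=2, n_bits=8)
database = [[1.0, 0.0], [-1.0, 0.0]]
nearest = search([1.0, 0.0], database, projections, k=1)
```

Because the projections here are random and ignore any labels, this corresponds only to the unsupervised baseline; a learned method in the spirit of the abstract would instead choose the projection vectors using the labeled pairs and the regularizer.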

Keywords:
Locality-sensitive hashing, Hash function, Dynamic perfect hashing, Computer science, Universal hashing, Feature hashing, Overfitting, Pattern recognition (psychology), Artificial intelligence, K-independent hashing, Nearest neighbor search, Linear hashing, Pairwise comparison, Metric (unit), Hash table, Double hashing, Machine learning

Metrics

Cited By: 846
FWCI (Field Weighted Citation Impact): 53.64
Refs: 55
Citation Normalized Percentile: 1.00
Is in top 1%
Is in top 10%

Topics

Advanced Image and Video Retrieval Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Video Surveillance and Tracking Methods
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Multimodal Machine Learning Applications
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition

Related Documents

JOURNAL ARTICLE

Generalized Debiased Semi-Supervised Hashing for Large-Scale Image Retrieval

Xingbo Liu, Xuening Zhang, Xiushan Nie, Yang Shi, Yilong Yin

Journal: Proceedings of the AAAI Conference on Artificial Intelligence Year: 2025 Vol: 39 (6) Pages: 5631-5639

JOURNAL ARTICLE

SSDH: Semi-Supervised Deep Hashing for Large Scale Image Retrieval

Jian Zhang, Yuxin Peng

Journal: IEEE Transactions on Circuits and Systems for Video Technology Year: 2017 Vol: 29 (1) Pages: 212-225

JOURNAL ARTICLE

Efficient Supervised Discrete Multi-View Hashing for Large-Scale Multimedia Search

Xu Lu, Lei Zhu, Jingjing Li, Huaxiang Zhang, Heng Tao Shen

Journal: IEEE Transactions on Multimedia Year: 2019 Vol: 22 (8) Pages: 2048-2060