Recent years have witnessed a dramatic increase in the number of community-contributed images. Hashing-based similarity search for social images has attracted considerable interest from the computer vision and multimedia communities due to its computational and memory efficiency. In this paper, we propose a novel weakly supervised hashing method, termed weakly supervised multimodal hashing, for scalable social image retrieval. Semantic-aware hash functions are learned by jointly leveraging weakly supervised tag information and visual information. Specifically, since the user-provided tags associated with social images describe their semantic content, the hash functions are learned by exploring the semantic structure encoded in the tags. However, user-provided tags are often noisy and incomplete. To avoid overfitting these weakly supervised tags, the local discriminative structure and the geometric structure of the visual space are also explored. In addition, to learn compact and non-redundant hash codes, the hash functions are constrained to be orthogonal, which reduces redundancy among the learned codes as much as possible, and an information-theoretic regularization based on the maximum entropy principle is introduced to maximize the information carried by each hash bit. The resulting learning problem is formulated as an eigenvalue problem, which can be solved efficiently. Extensive experiments on two widely used social image datasets show encouraging performance compared with state-of-the-art hashing techniques, demonstrating the effectiveness of the proposed method.
Zechao Li, Jinhui Tang, Liyan Zhang, Jian Yang
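The abstract's key computational claim is that, with orthogonal hash functions, the learning problem reduces to an eigenvalue problem. The following is a minimal NumPy sketch of that idea under simplifying assumptions: it maximizes a toy spectral objective combining a tag-based semantic similarity term with a visual covariance regularizer, obtains orthogonal projections from the top eigenvectors, and binarizes with the sign function. The objective, the trade-off parameter alpha, the co-tag similarity, and the function names are illustrative assumptions, not the paper's actual formulation.

```python
import numpy as np

def learn_hash_functions(X, S, n_bits, alpha=1.0):
    """Toy spectral hash-function learning (illustrative only).

    X: (n, d) visual feature matrix; S: (n, n) tag-based semantic similarity.
    Returns a (d, n_bits) projection with orthonormal columns.
    """
    X = X - X.mean(axis=0)           # center the visual features
    # Semantic term X^T S X pulls co-tagged images toward similar codes;
    # the covariance term alpha * X^T X regularizes against noisy tags.
    M = X.T @ S @ X + alpha * (X.T @ X)
    M = (M + M.T) / 2                # symmetrize for numerical stability
    _, eigvecs = np.linalg.eigh(M)   # eigenvalues in ascending order
    return eigvecs[:, -n_bits:]      # top eigenvectors; W^T W = I holds

def hash_codes(X, W):
    # Center, project, and binarize with the sign function.
    return ((X - X.mean(axis=0)) @ W > 0).astype(np.uint8)

# Toy usage with random features and binary tag vectors.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 16))
tags = rng.integers(0, 2, size=(100, 5))
S = (tags @ tags.T > 0).astype(float)   # images sharing any tag are similar
W = learn_hash_functions(X, S, n_bits=8)
codes = hash_codes(X, W)
print(codes.shape)                      # (100, 8)
```

Because the eigenvectors of a symmetric matrix are mutually orthogonal, the orthogonality constraint on the hash projections is satisfied automatically by the eigendecomposition, which is what makes formulations of this kind efficiently solvable in closed form.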