Cross-modal retrieval aims to provide flexible retrieval across different types of multimedia data. To address the scalability issue, binary code learning (a.k.a. hashing) is advocated, since it permits exact top-K retrieval with sub-linear time complexity. In this paper, we propose a new method called Semi-supervised Graph Convolutional Hashing network (SGCH), which learns a common Hamming space by preserving both intra-modality and inter-modality similarities via an end-to-end neural network. On the one hand, a graph convolutional network is utilized to explore high-order intra-modality similarity and simultaneously propagate semantic information from labeled samples to unlabeled data. On the other hand, a Siamese network projects the learned features into a common Hamming space. To bridge the inter-modality gap, an adversarial loss, which aims to learn modality-independent features by confusing a modality classifier, is incorporated into the overall loss function. Experimental evaluations on cross-modal retrieval tasks demonstrate that SGCH performs competitively against state-of-the-art methods.
Xiaobo Shen, G.B. Yu, Yinfan Chen, Xichen Yang, Yuhui Zheng
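The abstract describes three components: graph convolution over an intra-modality similarity graph, a Siamese projection into a common Hamming space, and an adversarial loss driven by a modality classifier. The sketch below illustrates how such pieces could be wired together; it is not the authors' implementation, and the framework (PyTorch), layer sizes, feature dimensions, and module names (GCNLayer, HashBranch, ModalityClassifier) are illustrative assumptions only.

```python
# Minimal illustrative sketch of the components described in the abstract.
# All dimensions, names, and the adversarial objective below are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class GCNLayer(nn.Module):
    """One graph convolution: propagate features over a (normalized) adjacency."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, x, adj):
        # adj: intra-modality similarity graph (assumed row-normalized)
        return F.relu(self.linear(adj @ x))


class HashBranch(nn.Module):
    """One branch of the Siamese network: GCN features -> relaxed k-bit codes."""
    def __init__(self, in_dim, hid_dim, code_len):
        super().__init__()
        self.gcn = GCNLayer(in_dim, hid_dim)
        self.hash = nn.Linear(hid_dim, code_len)

    def forward(self, x, adj):
        h = self.gcn(x, adj)
        return torch.tanh(self.hash(h))  # relaxed binary codes in (-1, 1)


class ModalityClassifier(nn.Module):
    """Discriminator that the hash branches try to confuse (adversarial loss)."""
    def __init__(self, code_len):
        super().__init__()
        self.fc = nn.Linear(code_len, 2)  # image vs. text

    def forward(self, codes):
        return self.fc(codes)


# Toy forward pass: 8 image nodes (4096-d) and 8 text nodes (1386-d), 64-bit codes.
img_x, txt_x = torch.randn(8, 4096), torch.randn(8, 1386)
img_adj, txt_adj = torch.eye(8), torch.eye(8)        # placeholder similarity graphs
img_net, txt_net = HashBranch(4096, 512, 64), HashBranch(1386, 512, 64)
disc = ModalityClassifier(64)

img_codes, txt_codes = img_net(img_x, img_adj), txt_net(txt_x, txt_adj)
codes = torch.cat([img_codes, txt_codes], dim=0)
labels = torch.cat([torch.zeros(8, dtype=torch.long), torch.ones(8, dtype=torch.long)])
adv_loss = -F.cross_entropy(disc(codes), labels)     # hash branches maximize confusion
print(img_codes.shape, adv_loss.item())
```

In practice the adversarial term would be optimized in alternation with (or via gradient reversal against) the discriminator, and combined with intra- and inter-modality similarity-preserving losses; those details are omitted here.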