This paper proposes an adversarial hashing cross-modal retrieval method based on a multimodal interactive attention mechanism. A cross-modal shared-subspace representation learning network, KGRU, is constructed to extract the global and local semantic information of each modality, and an attention mechanism is then introduced into the model to further capture the key local feature information in the different modalities. Experimental results show that the method accurately mines the structural information and semantic correlations between modalities, improves the accuracy of cross-modal retrieval, and improves computational time efficiency. It achieves state-of-the-art results in cross-modal retrieval experiments on the NUS-WIDE dataset.
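As a rough illustration of the kind of interactive attention described above, the sketch below lets one modality's local features attend over the other modality's local features before they are fused with a global vector into relaxed hash codes. This is a minimal sketch only: the paper does not provide code, and all names here (`CrossModalAttention`, `HashHead`, the dimensions, and the use of PyTorch) are assumptions chosen for illustration, not the authors' implementation.

```python
# Hypothetical sketch of cross-modal interactive attention feeding a hash head.
# All class and parameter names are assumptions; they are not from the paper.
import torch
import torch.nn as nn

class CrossModalAttention(nn.Module):
    """Local features of modality X attend over local features of modality Y."""
    def __init__(self, dim):
        super().__init__()
        self.query = nn.Linear(dim, dim)
        self.key = nn.Linear(dim, dim)
        self.value = nn.Linear(dim, dim)

    def forward(self, x_local, y_local):
        # x_local: (batch, n_x, dim), e.g. image region features
        # y_local: (batch, n_y, dim), e.g. text word features
        q = self.query(x_local)
        k = self.key(y_local)
        v = self.value(y_local)
        attn = torch.softmax(q @ k.transpose(1, 2) / q.size(-1) ** 0.5, dim=-1)
        return attn @ v  # (batch, n_x, dim): Y-aware local features for X

class HashHead(nn.Module):
    """Fuses a global feature with pooled attended local features into codes."""
    def __init__(self, dim, hash_bits):
        super().__init__()
        self.fc = nn.Linear(dim * 2, hash_bits)

    def forward(self, global_feat, attended_local):
        pooled = attended_local.mean(dim=1)            # pool local features
        fused = torch.cat([global_feat, pooled], dim=-1)
        return torch.tanh(self.fc(fused))              # relaxed codes in (-1, 1)

# Usage example with random tensors standing in for extracted features
if __name__ == "__main__":
    batch, n_img, n_txt, dim, bits = 4, 36, 20, 512, 64
    img_local = torch.randn(batch, n_img, dim)
    txt_local = torch.randn(batch, n_txt, dim)
    img_global = torch.randn(batch, dim)

    attn = CrossModalAttention(dim)
    head = HashHead(dim, bits)
    img_codes = head(img_global, attn(img_local, txt_local))
    print(img_codes.shape)  # torch.Size([4, 64])
```

In an adversarial hashing setup, such relaxed codes from both modalities would typically be pushed into a common distribution by a discriminator and binarized at retrieval time; that part is omitted here.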