Hashing for large-scale multimedia is a popular research topic, attracting much attention in computer vision and visual information retrieval. Previous works mostly focus on hashing images and texts, while approaches designed for videos are limited. In this paper, we propose a \textit{Supervised Recurrent Hashing} (SRH) method that exploits the discriminative representations learned by deep neural networks to design hashing approaches. A long short-term memory (LSTM) network is deployed to model the temporal structure of video samples. A max-pooling mechanism is introduced to embed the frames into fixed-length representations, which are then fed into a supervised hashing loss. Experiments on the UCF-101 dataset demonstrate that the proposed method significantly outperforms several state-of-the-art methods.
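The pipeline described above (per-frame features passed through a recurrent network, max-pooled over time into a fixed-length vector, then projected to binary codes) can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the trained LSTM is replaced by a hypothetical tanh recurrence with random weights, and the hash projection `P` stands in for the layer that the supervised loss would train.

```python
import numpy as np

rng = np.random.default_rng(0)

# frames, input dim, hidden dim, hash bits (illustrative sizes, not the paper's)
T, d_in, d_hid, n_bits = 16, 32, 24, 12

frames = rng.standard_normal((T, d_in))    # per-frame features of one video

# Stand-in recurrence (hypothetical weights, not a trained LSTM):
#   h_t = tanh(W x_t + U h_{t-1})
W = 0.1 * rng.standard_normal((d_hid, d_in))
U = 0.1 * rng.standard_normal((d_hid, d_hid))
h = np.zeros(d_hid)
states = []
for t in range(T):
    h = np.tanh(W @ frames[t] + U @ h)
    states.append(h)
states = np.stack(states)                  # (T, d_hid)

# Max-pooling over time yields a fixed-length video representation,
# independent of the number of frames T.
video_repr = states.max(axis=0)            # (d_hid,)

# Hypothetical hash projection followed by thresholding gives binary codes;
# in the paper this layer is what the supervised hashing loss trains.
P = 0.1 * rng.standard_normal((n_bits, d_hid))
codes = np.where(P @ video_repr >= 0, 1.0, -1.0)   # entries in {-1, +1}

print(video_repr.shape, codes.shape)
```

Because the max-pooling step collapses the time axis, videos with different frame counts all map to codes of the same length, which is what makes fixed-length binary indexing possible.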
Yan Xu, Fumin Shen, Xing Xu, Lianli Gao, Yuan Wang, Xiao Tan