JOURNAL ARTICLE

DLSTM approach to video modeling with hashing for large-scale video retrieval

Abstract

Although Query-by-Example techniques based on Euclidean distance in a multidimensional feature space have proved effective for image databases, this approach cannot be applied effectively to video, since the richness and complexity of video data would make the number of dimensions massive. This issue has been addressed by two recent solutions, namely Deterministic Quantization (DQ) and Dynamic Temporal Quantization (DTQ). DQ divides the video into equal segments and extracts a visual feature vector for each segment. The bag-of-words feature is then encoded by hashing to facilitate approximate nearest neighbor search using Hamming distance. One weakness of this approach is the deterministic segmentation of video data. DTQ improves on this by using dynamic video segmentation to obtain varied-length video segments. As a result, feature vectors extracted from these video segments can better capture the semantic content of the video. To support very large video databases, it is desirable to minimize the number of segments in order to keep the feature representation as small as possible. We achieve this by using only one video segment (i.e., no video segmentation is necessary) while obtaining even better retrieval performance. Our scheme models video using differential long short-term memory (DLSTM) recurrent neural networks and obtains a highly compact fixed-size feature representation from the hidden-state outputs of the DLSTM. Each of these features is further compressed by hashing it into binary bits via quantization. Experimental results on two public data sets, UCF101 and MSRActionPairs, indicate that the proposed video modeling technique outperforms DTQ by a significant margin.
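The hashing and Hamming-distance search described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the quantizer, so the example assumes the simplest possible one (one bit per dimension, set when the value exceeds a threshold of zero). The function names `binarize`, `hamming`, and `rank_by_hamming` are hypothetical, and the small hand-written vectors merely stand in for the fixed-size features taken from DLSTM hidden states; none of this is the authors' implementation.

```python
from typing import List, Sequence


def binarize(feature: Sequence[float], threshold: float = 0.0) -> int:
    """Quantize a real-valued feature vector into bits packed in an int.

    Assumption: one bit per dimension, set when the value exceeds `threshold`.
    The first dimension becomes the most significant bit.
    """
    code = 0
    for value in feature:
        code = (code << 1) | (1 if value > threshold else 0)
    return code


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two binary codes differ."""
    return bin(a ^ b).count("1")


def rank_by_hamming(query: int, database: List[int]) -> List[int]:
    """Return database indices sorted by ascending Hamming distance to the query."""
    return sorted(range(len(database)), key=lambda i: hamming(query, database[i]))


# Hypothetical 8-dimensional features standing in for DLSTM hidden-state outputs.
database_features = [
    [0.9, -0.2, 0.4, -0.7, 0.1, 0.3, -0.5, 0.8],
    [-0.6, 0.5, -0.3, 0.2, -0.9, -0.1, 0.7, -0.4],
    [0.8, -0.1, 0.5, -0.6, -0.2, 0.4, -0.3, 0.9],
]
query_feature = [0.7, -0.3, 0.6, -0.5, 0.2, 0.1, -0.4, 0.6]

database_codes = [binarize(f) for f in database_features]
query_code = binarize(query_feature)
ranking = rank_by_hamming(query_code, database_codes)
print(ranking)  # indices of database videos, nearest first
```

Because each video is reduced to a single short binary code, comparing a query against a database entry costs one XOR and a bit count, which is what makes this kind of representation attractive for very large collections.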

Keywords:
Computer science; Artificial intelligence; Pattern recognition (psychology); Feature vector; Video tracking; Nearest neighbor search; Hamming space; Hash function; Segmentation; Feature (linguistics); Vector quantization; Search engine indexing; Image retrieval; Quantization (signal processing); Computer vision; Video processing; Algorithm; Hamming code; Image (mathematics)

Metrics

Cited By: 10
FWCI (Field Weighted Citation Impact): 0.84
Refs: 21
Citation Normalized Percentile: 0.83

Topics

Advanced Image and Video Retrieval Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Video Analysis and Summarization
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Image Retrieval and Classification Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition

Related Documents

JOURNAL ARTICLE

Attention-Based Video Hashing for Large-Scale Video Retrieval

Yingxin Wang, Xiushan Nie, Yang Shi, Xin Zhou, Yilong Yin

Journal: IEEE Transactions on Cognitive and Developmental Systems, Year: 2019, Vol: 13 (3), Pages: 491-502
JOURNAL ARTICLE

Unsupervised Deep Video Hashing via Balanced Code for Large-Scale Video Retrieval

Gengshen Wu, Jungong Han, Yuchen Guo, Li Liu, Guiguang Ding, Qiang Ni, Ling Shao

Journal: IEEE Transactions on Image Processing, Year: 2018, Vol: 28 (4), Pages: 1993-2007
JOURNAL ARTICLE

Classification-enhancement deep hashing for large-scale video retrieval

Xiushan Nie, Xin Zhou, Yang Shi, Jiande Sun, Yilong Yin

Journal: Applied Soft Computing, Year: 2021, Vol: 109, Pages: 107467-107467
JOURNAL ARTICLE

Stochastic Multiview Hashing for Large-Scale Near-Duplicate Video Retrieval

Yanbin Hao, Tingting Mu, Richang Hong, Meng Wang, Ning An, John Y. Goulermas

Journal: IEEE Transactions on Multimedia, Year: 2016, Vol: 19 (1), Pages: 1-14