JOURNAL ARTICLE

Perceptual Robust Hashing for Video Copy Detection with Unsupervised Learning

Abstract

In this paper, we propose an end-to-end perceptual robust hashing scheme for video copy detection based on unsupervised learning. First, the spatio-temporal information in videos is fused and condensed into high-dimensional features by a multi-scale feature fusion model built on a 3D-CNN, which integrates the Inception block with a 3D self-attention mechanism. Then, we calculate the correlation distances between the extracted features to differentiate perceptual contents. Based on the resulting similarity relationships, we dynamically generate pseudo-labels and use them to guide model training for video hash generation. In addition, we design dual constraints so that the hash code achieves both satisfactory robustness and discrimination. Extensive experiments demonstrate that the proposed scheme achieves better copy detection performance than existing schemes and performs well even on manipulations not seen during training.
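The pseudo-labelling step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the two thresholds (sim_thresh, dis_thresh), the sign-based binarization, and the Hamming comparison are all assumptions made for illustration, since the abstract does not state how distances are thresholded or how hash codes are compared. The sketch assumes each video has already been reduced to a plain feature vector.

```python
import math


def correlation_distance(u, v):
    """Return 1 - Pearson correlation of two equal-length feature vectors."""
    mu_u = sum(u) / len(u)
    mu_v = sum(v) / len(v)
    du = [x - mu_u for x in u]
    dv = [x - mu_v for x in v]
    num = sum(a * b for a, b in zip(du, dv))
    den = math.sqrt(sum(a * a for a in du) * sum(b * b for b in dv))
    if den == 0.0:
        raise ValueError("correlation distance is undefined for constant vectors")
    return 1.0 - num / den


def pseudo_labels(features, sim_thresh=0.3, dis_thresh=0.7):
    """Label each pair 1 (similar), 0 (dissimilar), or None (uncertain).

    Both thresholds are hypothetical values chosen for this sketch.
    """
    labels = {}
    for i in range(len(features)):
        for j in range(i + 1, len(features)):
            d = correlation_distance(features[i], features[j])
            if d <= sim_thresh:
                labels[(i, j)] = 1
            elif d >= dis_thresh:
                labels[(i, j)] = 0
            else:
                labels[(i, j)] = None
    return labels


def binarize(outputs):
    """Turn real-valued network outputs into a bit list by sign (assumed rule)."""
    return [1 if x >= 0 else 0 for x in outputs]


def hamming_distance(h1, h2):
    """Count the positions at which two equal-length hash codes differ."""
    return sum(a != b for a, b in zip(h1, h2))


# Toy features: the second vector is a scaled copy of the first,
# the third is the first vector reversed.
features = [[1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0], [4.0, 3.0, 2.0, 1.0]]
labels = pseudo_labels(features)
```

In this toy input the first two vectors are perfectly correlated and would be labelled as a similar pair, while the reversed vector is anti-correlated with both and would be labelled dissimilar. In a training loop, such labels would be regenerated as the features change, which is one way to read "dynamically generate" in the abstract.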

Keywords:
Computer science, Hash function, Robustness (evolution), Artificial intelligence, Pattern recognition (psychology), Feature hashing, Exploit, Unsupervised learning, Feature extraction, Hash table, Computer vision

Metrics

Cited By: 1
FWCI (Field Weighted Citation Impact): 0.18
Refs: 31
Citation Normalized Percentile: 0.39

Topics

Video Analysis and Summarization
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Advanced Image and Video Retrieval Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Video Surveillance and Tracking Methods
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition