Self-Supervised Video Similarity Learning

Giorgos Kordopatis-Zilos; Giorgos Tolias; Christos Tzelepis; Ioannis Kompatsiaris; Ioannis Patras; Symeon Papadopoulos

doi:10.1109/cvprw59228.2023.00504

JOURNAL ARTICLE

Year: 2023 Pages: 4756-4766

Abstract

We introduce S2VS, a video similarity learning approach with self-supervision. Self-Supervised Learning (SSL) is typically used to train deep models on a proxy task so as to have strong transferability on target tasks after fine-tuning. Here, in contrast to prior work, SSL is used to perform video similarity learning and address multiple retrieval and detection tasks at once with no use of labeled data. This is achieved by learning via instance-discrimination with task-tailored augmentations and the widely used InfoNCE loss together with an additional loss operating jointly on self-similarity and hard-negative similarity. We benchmark our method on tasks where video relevance is defined with varying granularity, ranging from video copies to videos depicting the same incident or event. We learn a single universal model that achieves state-of-the-art performance on all tasks, surpassing previously proposed methods that use labeled data. The code and pretrained models are publicly available at: https://github.com/gkordo/s2vs
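The instance-discrimination objective named in the abstract can be illustrated with a minimal InfoNCE sketch. This is not the authors' implementation (that is in the linked repository): the function name `info_nce`, the temperature of 0.07, the 128-dimensional embeddings, and the use of additive noise as a stand-in for the task-tailored augmentations are all assumptions made for illustration. The additional loss on self-similarity and hard-negative similarity is not shown, since the abstract does not specify its form.

```python
import numpy as np

def info_nce(anchors, positives, temperature=0.07):
    """InfoNCE over a batch (illustrative sketch, hypothetical helper).

    Row i of `positives` is assumed to be an augmented view of row i of
    `anchors`; every other row in the batch acts as a negative.
    """
    # L2-normalize so that dot products are cosine similarities.
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    # (N, N) similarity matrix; the diagonal holds the positive pairs.
    logits = a @ p.T / temperature
    # Subtract the row-wise max for numerical stability of the softmax.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Average negative log-likelihood of picking the correct positive.
    return -np.mean(np.diag(log_prob))

rng = np.random.default_rng(0)
videos = rng.normal(size=(8, 128))                        # stand-in video embeddings
views = videos + 0.05 * rng.normal(size=videos.shape)     # noise as a mock augmentation

loss_aligned = info_nce(videos, views)                    # correct pairing
loss_shuffled = info_nce(videos, np.roll(views, 1, axis=0))  # mismatched pairing
print(loss_aligned, loss_shuffled)
```

With correctly paired views the loss is close to zero, while mismatched pairs yield a much larger value, which is the signal that pulls augmented versions of the same video together and pushes other videos apart.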

Keywords:

Computer science; Similarity (geometry); Artificial intelligence; Benchmark (surveying); Granularity; Task (project management); Machine learning; Relevance (law); Source code; Code (set theory); Labeled data; Ranging; Image (mathematics)

Metrics

FWCI (Field Weighted Citation Impact): 4.34

Refs: 105

Citation Normalized Percentile: 0.93

Topics

Domain Adaptation and Few-Shot Learning

Physical Sciences → Computer Science → Artificial Intelligence

Multimodal Machine Learning Applications

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Human Pose and Action Recognition

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Related Documents

Rethinking Self-supervised Correspondence Learning: A Video Frame-level Similarity Perspective

Self-supervised video representation learning

Similarity contrastive estimation for image and video soft contrastive self-supervised learning

3D-CSL: Self-Supervised 3D Context Similarity Learning for Near-Duplicate Video Retrieval

S4: Self-Supervised Learning of Spatiotemporal Similarity