Self-Supervised Object Detection and Retrieval Using Unlabeled Videos

Elad Amrani; Rami Ben‐Ari; Inbar Shapira; Tal Hakim; Alex Bronstein

doi:10.1109/cvprw50498.2020.00485

JOURNAL ARTICLE

Year: 2020 Pages: 4100-4108

Abstract

Learning an object detection or retrieval system requires a large data set with manual annotations. Such data are expensive and time-consuming to create and therefore difficult to obtain on a large scale. In this work, we propose using the natural correlation in narrations and the visual presence of objects in video to learn an object detector and retriever without any manual labeling involved. We pose the problem as weakly supervised learning with noisy labels, and propose a novel object detection and retrieval paradigm under these constraints. We handle the background rejection by using contrastive samples and confront the high level of label noise with a new clustering score. Our evaluation is based on a set of ten objects with manual ground truth annotation in almost 5000 frames extracted from instructional videos from the web. We demonstrate superior results compared to state-of-the-art weakly-supervised approaches and report a strongly-labeled upper bound as well. While the focus of the paper is object detection and retrieval, the proposed methodology can be applied to a broader range of noisy weakly-supervised problems.

Keywords:

Computer science; Artificial intelligence; Object (grammar); Ground truth; Object detection; Image retrieval; Pattern recognition (psychology); Annotation; Cluster analysis; Supervised learning; Set (abstract data type); Focus (optics); Noise (video); Computer vision; Image (mathematics); Artificial neural network

Metrics

FWCI (Field Weighted Citation Impact): 0.84

Citation Normalized Percentile: 0.74

Topics

Advanced Image and Video Retrieval Techniques

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Multimodal Machine Learning Applications

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Advanced Neural Network Applications

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Related Documents

Self-Supervised Learning of Object Segmentation from Unlabeled RGB-D Videos

Self-Supervised Object Detection from Egocentric Videos

Semi-supervised Object Detection with Unlabeled Data

Exploiting Unlabeled Videos for Video-Text Retrieval via Pseudo-Supervised Learning