JOURNAL ARTICLE

Unsupervised object discovery and localization in images and videos

Abstract

This paper addresses unsupervised discovery and localization of dominant objects from a noisy collection of images or videos. The setting of this problem is fully unsupervised, without even class labels or any assumption of a single dominant class, and thus far more general than those of typical colocalization or weakly-supervised localization tasks. Interestingly, our approach also discovers the topology of images/frames associated with instances of the same object class, a role normally left to supervisory information in the form of class labels in conventional image and video understanding methods. We tackle the discovery and localization problem using a part-based region matching approach: Off-the-shelf region proposals are extracted to form a set of candidate bounding boxes for objects and object parts, and these regions are effectively matched across images/frames. For each image/frame, a dominant object is localized by comparing the scores of candidate regions and selecting those that stand out over other regions containing them. Given a video collection, we also associate similar object regions along consecutive frames within the same video, thus achieving unsupervised tracking. Extensive experimental evaluations on standard benchmarks demonstrate that the proposed approach substantially outperforms the current state of the art in colocalization, and achieves robust object discovery in challenging mixed-class datasets.
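The localization step described above, selecting regions that stand out over the regions containing them, can be illustrated with a minimal sketch. It assumes each candidate box already carries a matching score from the cross-image region matching. The names `contains`, `standout_scores`, and `localize` are hypothetical, and the rule used here (a region's score minus the highest score among the boxes that enclose it) is one plausible reading of the abstract, not necessarily the paper's exact formulation.

```python
from typing import List, Tuple

# Axis-aligned box: (x_min, y_min, x_max, y_max). Illustrative type only.
Box = Tuple[float, float, float, float]


def contains(outer: Box, inner: Box) -> bool:
    """Return True if `outer` fully encloses `inner` and is a different box."""
    return (
        outer != inner
        and outer[0] <= inner[0]
        and outer[1] <= inner[1]
        and outer[2] >= inner[2]
        and outer[3] >= inner[3]
    )


def standout_scores(boxes: List[Box], scores: List[float]) -> List[float]:
    """Score of each box minus the best score among the boxes containing it.

    Assumption: a box with no enclosing box keeps its raw score.
    """
    result = []
    for i, box in enumerate(boxes):
        enclosing = [
            scores[j] for j, other in enumerate(boxes)
            if j != i and contains(other, box)
        ]
        result.append(scores[i] - max(enclosing) if enclosing else scores[i])
    return result


def localize(boxes: List[Box], scores: List[float]) -> Box:
    """Pick the box with the highest stand-out score."""
    standout = standout_scores(boxes, scores)
    best = max(range(len(boxes)), key=lambda i: standout[i])
    return boxes[best]


# Toy example: whole image, an object-sized box, and a part inside the object.
boxes = [(0, 0, 100, 100), (20, 20, 60, 60), (25, 25, 40, 40)]
scores = [0.2, 0.9, 0.8]  # made-up matching scores, for illustration only
print(localize(boxes, scores))  # the object-sized box (20, 20, 60, 60)
```

In this toy case the part-level box has a high raw score but is penalized because the object-level box enclosing it scores even higher, so the object-level box is selected.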

Keywords:
Artificial intelligence; Computer science; Object (grammar); Class (philosophy); Computer vision; Pattern recognition (psychology); Matching (statistics); Set (abstract data type); Video tracking; Frame (networking); Bounding overwatch; Object detection; Image (mathematics); Minimum bounding box; Mathematics

Metrics

Cited By: 7
FWCI (Field Weighted Citation Impact): 0.00
Refs: 16
Citation Normalized Percentile: 0.11

Topics

Advanced Image and Video Retrieval Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Advanced Neural Network Applications
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Robotics and Sensor-Based Localization
Physical Sciences →  Engineering →  Aerospace Engineering