We consider a novel unsupervised learning setup in which training examples are grouped into small bundles, each of which preserves the identity of a single object. Such a setup may arise in practice when we are able to detect moving objects in videos without being able to classify their identity. Our approach constructs a similarity graph of bundles, from which we recover the identities of objects by applying a community detection algorithm. Finally, we train a Siamese neural network to discriminate between examples from different components and show that the representations acquired in this way produce well-separated clusters. Part of our contribution is also a unique dataset that we assembled in order to test the presented idea.
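The pipeline summarized above can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names (`bundle_distance`, `recover_identities`, `siamese_pairs`), the choice of the smallest Euclidean distance between members as the bundle similarity, the `max_distance` threshold, and the use of plain connected components as a simple stand-in for a community detection algorithm.

```python
from itertools import combinations
from math import dist


def bundle_distance(a, b):
    """Hypothetical bundle dissimilarity: the smallest Euclidean distance
    between any example of bundle a and any example of bundle b."""
    return min(dist(x, y) for x in a for y in b)


def recover_identities(bundles, max_distance):
    """Connect bundles closer than max_distance (assumed threshold) and
    return one component label per bundle. Connected components are used
    here as a stand-in for community detection."""
    parent = list(range(len(bundles)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(bundles)), 2):
        if bundle_distance(bundles[i], bundles[j]) < max_distance:
            parent[find(i)] = find(j)

    # Relabel roots as 0, 1, 2, ... in order of first appearance.
    relabel = {}
    return [relabel.setdefault(find(i), len(relabel)) for i in range(len(bundles))]


def siamese_pairs(labels):
    """Pairs of bundle indices with a flag that is True when both bundles
    fall in the same component (a positive pair for Siamese training)."""
    return [(i, j, labels[i] == labels[j])
            for i, j in combinations(range(len(labels)), 2)]


# Toy data: four bundles of 2-D points around two well-separated locations.
bundles = [
    [(0.0, 0.0), (0.1, 0.0)],
    [(0.2, 0.0), (0.3, 0.1)],
    [(5.0, 5.0), (5.1, 5.0)],
    [(5.2, 5.1)],
]
labels = recover_identities(bundles, max_distance=0.5)  # [0, 0, 1, 1]
pairs = siamese_pairs(labels)
```

In this sketch the recovered component labels play the role of object identities, and the same/different flags in `pairs` are the kind of supervision a Siamese network could be trained on.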