Pretrained, self-supervised vision transformers are revolutionizing computer vision with their ability to learn useful features for downstream classification tasks without requiring labeled training data. This paper asks whether these self-supervised techniques can also transform the field of continuous learning. A fundamental challenge for continuous learning algorithms is to learn new tasks sequentially, using only the new task data, without degrading performance on previously learned tasks. Sequentially finetuning a neural network's backbone on a new classification task often overfits the network's weights to the new classes, altering and degrading its performance on previously learned classes. This paper introduces a new approach that joins a pretrained, self-supervised vision transformer with an incremental learning technique called eXtending Rapid Class Augmentation (XRCA). XRCA is distinguished by its recursive memory and classifier-based incremental learning approach, which learns a new classification task extremely rapidly and in a manner that jointly optimizes over both old and new classes using only the new class data. This paper examines the coupling of this classifier-focused incremental learning approach with a pretrained, self-supervised feature-extraction backbone, and compares the resulting self-supervised approach to those that use pretrained supervised features, finetuned features, and domain-adapted features. The results indicate a promising new direction for continuous learning algorithms: pairing self-supervision's ability to generalize to new classes with a recursive, classifier-centric approach to incremental learning.
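The abstract does not spell out XRCA's update rule, but a recursive, classifier-based incremental learner of the kind described can be sketched with a recursive least-squares (RLS) update over frozen backbone features: a "memory" matrix (the inverse feature covariance) and the classifier weights are updated one sample at a time, yielding the same classifier that joint training over all seen data would produce, without revisiting old data. This is a minimal illustrative sketch, not the authors' implementation; the function names and the RLS formulation are assumptions.

```python
import numpy as np

def rls_init(dim, num_classes, lam=1.0):
    # Hypothetical recursive state: inverse feature-covariance "memory" P
    # (ridge-regularized by lam) and linear classifier weights W.
    P = np.eye(dim) / lam
    W = np.zeros((dim, num_classes))
    return P, W

def rls_update(P, W, x, y_onehot):
    # Rank-1 Sherman-Morrison update: fold one feature vector x (from a
    # frozen backbone) into the memory and classifier using only that
    # sample, while remaining jointly optimal over all samples seen so far.
    x = x.reshape(-1, 1)
    Px = P @ x
    k = Px / (1.0 + x.T @ Px)          # gain vector
    P = P - k @ Px.T                   # updated inverse covariance
    err = y_onehot - (W.T @ x).ravel() # innovation for each class output
    W = W + k @ err.reshape(1, -1)     # classifier update
    return P, W
```

After streaming all samples through `rls_update`, `W` matches the closed-form ridge-regression classifier fit jointly on every sample, which is the sense in which such recursive updates "jointly optimize over both old and new classes" from new data alone.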