Unlike traditional cameras, event-based cameras capture changes in brightness as asynchronous events, a property that has sparked tremendous interest owing to their wide range of applications. In this work, we address, for the first time in the literature, the task of few-shot classification of event data without forgetting the base classes on which the model was initially trained. This relaxes not only the constraint that data from all possible classes be available before the initial model is trained, but also the constraint of capturing large amounts of training data for each class we want to classify. The proposed framework has three main stages. First, we train the base classifier while augmenting the original event data with a data-mixing technique, so that the feature extractor generalizes better to unseen classes; we also impose an adaptive semantic-similarity-based margin between the classifier weights, which guarantees that the margin between similar classes is greater than that between dissimilar classes and thereby reduces confusion among similar classes. Second, weight imprinting is employed to learn the initial classifier weights for the new classes from only a few examples. Finally, we fine-tune the entire framework end-to-end using a class-imbalance-aware loss; this is made possible by converting the event data through a series of differentiable operations before feeding it into our network. Extensive experiments on few-shot versions of two standard event-camera datasets demonstrate the effectiveness of the proposed framework. We believe this study will serve as a solid foundation for future work in this important field.
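To make the second stage concrete, the sketch below shows a minimal PyTorch implementation of standard weight imprinting (in the spirit of Qi et al., "Low-Shot Learning with Imprinted Weights"): each novel class weight is initialized as the renormalized mean of the L2-normalized embeddings of that class's support examples and appended to the base cosine classifier. All names here (`imprint_new_class_weights`, the toy `extractor`, tensor shapes) are illustrative assumptions for exposition, not the authors' code.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def imprint_new_class_weights(feature_extractor, base_weights, support_x, support_y):
    """Append imprinted weights for novel classes to a cosine classifier.

    Illustrative sketch: each novel class weight is the L2-renormalized mean
    of the unit-norm embeddings of that class's few support examples.
    """
    feats = F.normalize(feature_extractor(support_x), dim=1)   # (N, D) unit embeddings
    novel_ids = sorted(set(support_y.tolist()))
    protos = torch.stack([
        F.normalize(feats[support_y == c].mean(dim=0), dim=0)  # per-class prototype
        for c in novel_ids
    ])
    return torch.cat([base_weights, protos], dim=0)            # (base + novel, D)

# Toy usage: 10 base classes, then a 5-way 5-shot novel session
# (inputs stand in for frames converted from event data).
extractor = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(32 * 32, 64))
base_w = F.normalize(torch.randn(10, 64), dim=1)
x = torch.randn(25, 1, 32, 32)
y = torch.arange(10, 15).repeat_interleave(5)    # labels 10..14, 5 shots each
w = imprint_new_class_weights(extractor, base_w, x, y)
print(w.shape)  # torch.Size([15, 64]); cosine logits = F.normalize(feats, dim=1) @ w.T
```

Because imprinted weights lie on the same unit sphere as the base-class weights, the expanded classifier can score base and novel classes jointly before any fine-tuning takes place.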