One‐stage self‐distillation guided knowledge transfer for long‐tailed visual recognition

Yuelong Xia; Shu Zhang; Jun Wang; Wei Zou; Juxiang Zhou; Bin Wen

doi:10.1002/int.23068

Year: 2022; Journal: International Journal of Intelligent Systems; Vol: 37 (12); Pages: 11893-11908; Publisher: Wiley

Abstract

Deep learning has achieved remarkable progress in visual recognition on balanced data sets but still performs poorly on real-world long-tailed data distributions. Existing methods mainly either decouple the problem into two-stage training, that is, representation learning followed by classifier training, or rely on multistage training based on knowledge distillation, both of which require many training steps and extra computation cost. In this paper, we propose a conceptually simple yet effective One-stage Long-tailed Self-Distillation framework, called OLSD, which combines representation learning and classifier training in a single training stage. For representation learning, we draw samples from two different sampling distributions, mix them with mixup, and feed the results into two branches, where a collaborative consistency loss is introduced to train the network for consistency; we theoretically show that the proposed mixup naturally yields a tail-majority distribution mixup. For classifier training, we introduce balanced self-distillation guided knowledge transfer to improve generalization performance, and we theoretically show that the proposed knowledge transfer implicitly minimizes not only the cross-entropy but also the KL divergence between head-to-tail and tail-to-head. Extensive experiments on long-tailed CIFAR-10/100, ImageNet-LT, and multilabel long-tailed VOC-LT demonstrate the effectiveness of the proposed method.
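The representation-learning step described in the abstract can be illustrated with a minimal sketch. The abstract does not name the two sampling distributions or the exact form of the collaborative consistency loss, so the choices below are assumptions made for illustration only: an instance-balanced sampler, a class-balanced sampler, standard mixup with a Beta-distributed coefficient, and a symmetric KL divergence between the two branches' softmax outputs. All function names are hypothetical and are not taken from the OLSD paper.

```python
import math
import random


def instance_sampler(labels, rng):
    """Pick an index uniformly over samples (follows the long-tailed distribution)."""
    return rng.randrange(len(labels))


def class_balanced_sampler(labels, rng):
    """Pick a class uniformly, then a sample within it (favours tail classes)."""
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    cls = rng.choice(sorted(by_class))
    return rng.choice(by_class[cls])


def mixup(x_a, x_b, lam):
    """Convex combination of two feature vectors with coefficient lam."""
    return [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def symmetric_consistency(logits_1, logits_2):
    """Symmetric KL between two branches' predictions (assumed loss form)."""
    p, q = softmax(logits_1), softmax(logits_2)
    return 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))


# Toy long-tailed data set: 90 head, 9 medium, and 1 tail sample.
rng = random.Random(0)
labels = [0] * 90 + [1] * 9 + [2] * 1
features = [[float(i), float(y)] for i, y in enumerate(labels)]

i = instance_sampler(labels, rng)        # mostly head-class samples
j = class_balanced_sampler(labels, rng)  # each class equally likely
lam = rng.betavariate(1.0, 1.0)          # mixup coefficient (assumed Beta(1, 1))
x_mix = mixup(features[i], features[j], lam)
```

Under these assumptions the class-balanced sampler draws the single tail sample roughly one third of the time, so mixed inputs contain far more tail-class content than the raw data would supply. The consistency term is zero when both branches produce identical predictions and grows as they diverge.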

Keywords:

Computer science; Distillation; Artificial intelligence; Machine learning; Classifier (UML); Transfer of learning; Pattern recognition (psychology)

Metrics

FWCI (Field Weighted Citation Impact): 0.39

Citation Normalized Percentile: 0.61

Topics

Domain Adaptation and Few-Shot Learning

Physical Sciences → Computer Science → Artificial Intelligence

Advanced Image and Video Retrieval Techniques

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Multimodal Machine Learning Applications

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Related Documents

Attention-Guided Feature Distillation for Long-Tailed Visual Recognition

Self Supervision to Distillation for Long-Tailed Visual Recognition

BSDIB: Balanced Self-Distillation Information Bottleneck for Long-Tailed Visual Recognition

Balanced self-distillation for long-tailed recognition

KDTM: Multi-Stage Knowledge Distillation Transfer Model for Long-Tailed DGA Detection