JOURNAL ARTICLE

Mask Guided Knowledge Distillation for Single Shot Detector

Abstract

In this paper, we explore distilling small networks for the object detection task. Specifically, we propose a two-stage approach that uses knowledge distillation to learn more compact and efficient detectors within the single-shot object detection framework. In the first stage, the student model learns the feature maps for each prediction head from the teacher model. Instead of fitting the whole feature map directly, we propose a mask guided structure that covers not only the entire feature map (i.e., global features) but also the regions of the feature map covered by objects (i.e., local features), which significantly improves the performance of the student network. In the second stage, the ground truth is used to further refine the student. Experimental results on the PASCAL VOC and KITTI datasets demonstrate the effectiveness of the proposed approach. We achieve 56.88% mAP on VOC2007 at 143 FPS with a 1/8 VGG16 backbone.
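
The abstract does not give the exact loss, so the following is only a minimal sketch of how a first-stage objective with global and local terms might look. It assumes a squared-error (L2) feature loss, normalized ground-truth boxes rasterized onto the feature grid, and student and teacher feature maps that already share the same shape (for example after an adaptation layer, not shown). The names `boxes_to_mask`, `mask_guided_loss`, and `local_weight` are hypothetical and do not come from the paper.

```python
import math


def boxes_to_mask(boxes, height, width):
    """Rasterize normalized (xmin, ymin, xmax, ymax) boxes onto an H x W grid.

    Assumption: a cell counts as "covered by the object" if the box overlaps it.
    """
    mask = [[0.0] * width for _ in range(height)]
    for xmin, ymin, xmax, ymax in boxes:
        # Convert normalized coordinates to cell indices, clamped to the grid.
        x0 = max(0, math.floor(xmin * width))
        y0 = max(0, math.floor(ymin * height))
        x1 = min(width, math.ceil(xmax * width))
        y1 = min(height, math.ceil(ymax * height))
        for y in range(y0, y1):
            for x in range(x0, x1):
                mask[y][x] = 1.0
    return mask


def mask_guided_loss(student, teacher, mask, local_weight=1.0):
    """Global + local feature imitation loss (assumed L2 form).

    student, teacher: C x H x W nested lists of floats with identical shapes.
    mask: H x W nested list with 1.0 inside object regions, 0.0 elsewhere.
    """
    global_sum = 0.0  # squared error over the whole feature map
    local_sum = 0.0   # squared error restricted to object regions
    count = 0
    for s_ch, t_ch in zip(student, teacher):
        for s_row, t_row, m_row in zip(s_ch, t_ch, mask):
            for s, t, m in zip(s_row, t_row, m_row):
                sq = (s - t) ** 2
                global_sum += sq
                local_sum += m * sq
                count += 1
    # Number of masked elements across all channels.
    masked = len(student) * sum(sum(row) for row in mask)
    global_term = global_sum / count
    local_term = local_sum / masked if masked > 0 else 0.0
    return global_term + local_weight * local_term
```

Under these assumptions, an error inside an object region contributes to both terms, while an error in the background contributes only to the global term, so the student is pushed harder to match the teacher where objects are. In a two-stage schedule like the one described, a loss of this kind would be applied per prediction head in the first stage, and the ordinary detection loss against the ground truth would take over in the second.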

Keywords:
Pascal (unit); Computer science; Single shot; Artificial intelligence; Object detection; Detector; Feature (linguistics); Feature extraction; Pattern recognition (psychology); Shot (pellet); Computer vision; Object (grammar)

Metrics

Cited By: 13
FWCI (Field Weighted Citation Impact): 1.07
Refs: 29
Citation Normalized Percentile: 0.81

Topics

Advanced Neural Network Applications
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Domain Adaptation and Few-Shot Learning
Physical Sciences →  Computer Science →  Artificial Intelligence
Advanced Image and Video Retrieval Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition