JOURNAL ARTICLE

Learning to Distill Convolutional Features into Compact Local Descriptors

Abstract

Extracting local descriptors, or features, is an essential step in solving image matching problems. Recent methods in the literature focus mainly on extracting effective descriptors and pay little attention to descriptor size. In this work, we study how to learn a compact yet effective local descriptor. The proposed method distills multiple intermediate features of a pretrained convolutional neural network to encode different levels of visual information, from local textures to non-local semantics, resulting in local descriptors of a designated dimension. Experiments on standard benchmarks for semantic correspondence show that it significantly outperforms existing models while using descriptors up to 100 times smaller. Furthermore, although trained on a small dataset for semantic correspondence, the proposed method also generalizes well to other image matching tasks, achieving results comparable to the state of the art on wide-baseline matching and visual localization benchmarks.
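
The core idea of fusing features from several network depths into one descriptor of a chosen dimension can be illustrated with a minimal NumPy sketch. It is not the authors' method: the abstract does not specify the architecture or the training objective, so the nearest-neighbour upsampling, the single linear projection, the random (untrained) weights, and the names `upsample_nearest`, `compact_descriptors`, and `match` are all assumptions made for illustration.

```python
import numpy as np


def upsample_nearest(feat, out_h, out_w):
    """Resize a (C, H, W) feature map to (C, out_h, out_w) by nearest-neighbour indexing."""
    _, h, w = feat.shape
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return feat[:, rows[:, None], cols[None, :]]


def compact_descriptors(feature_maps, projection):
    """Fuse multi-level feature maps into per-location descriptors of dimension D.

    feature_maps: list of (C_i, H_i, W_i) arrays taken from different depths.
    projection:   (D, sum(C_i)) matrix. Hypothetical: random in this sketch,
                  whereas a real system would learn it.
    Returns an (H*W, D) array of L2-normalised descriptors at the finest resolution.
    """
    out_h = max(f.shape[1] for f in feature_maps)
    out_w = max(f.shape[2] for f in feature_maps)
    # Bring every level to the same grid, then stack along the channel axis.
    stacked = np.concatenate(
        [upsample_nearest(f, out_h, out_w) for f in feature_maps], axis=0
    )
    flat = stacked.reshape(stacked.shape[0], -1).T  # (H*W, sum(C_i))
    desc = flat @ projection.T                      # (H*W, D)
    norms = np.linalg.norm(desc, axis=1, keepdims=True)
    return desc / np.maximum(norms, 1e-12)


def match(desc_a, desc_b):
    """For each descriptor in desc_a, return the index of its most similar one in desc_b."""
    similarity = desc_a @ desc_b.T  # cosine similarity, since rows are unit length
    return similarity.argmax(axis=1)
```

For example, three stand-in feature maps with 64, 256, and 512 channels (arbitrary sizes, not taken from the article) can be passed together with an 8 x 832 projection matrix to obtain one 8-dimensional unit-length descriptor per location of the finest grid; `match` then pairs locations across two images by cosine similarity.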

Keywords:
Computer science; Artificial intelligence; Convolutional neural network; Pattern recognition (psychology); Semantics (computer science); Matching (statistics); Dimension (graph theory); ENCODE; Feature extraction; Image (mathematics); Focus (optics); Mathematics

Metrics

Cited By: 9
FWCI (Field Weighted Citation Impact): 0.82
Refs: 95
Citation Normalized Percentile: 0.72

Topics

Advanced Image and Video Retrieval Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Robotics and Sensor-Based Localization
Physical Sciences →  Engineering →  Aerospace Engineering
Advanced Neural Network Applications
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition