JOURNAL ARTICLE

Part-Guided Relational Transformers for Fine-Grained Visual Recognition

Yifan Zhao, Jia Li, Xiaowu Chen, Yonghong Tian

Year: 2021 Journal: IEEE Transactions on Image Processing Vol: 30 Pages: 9470-9481 Publisher: Institute of Electrical and Electronics Engineers

Abstract

Fine-grained visual recognition aims to classify objects with visually similar appearances into subcategories, and it has made great progress with the development of deep CNNs. However, handling the subtle differences between subcategories remains a challenge. In this paper, we propose to address this issue in one unified framework from two aspects: constructing feature-level interrelationships and capturing part-level discriminative features. This framework, named PArt-guided Relational Transformers (PART), learns discriminative part features with an automatic part discovery module and explores intrinsic correlations with a feature transformation module that adapts Transformer models from the field of natural language processing. The part discovery module efficiently discovers the discriminative regions, which correspond closely to the gradient descent procedure. The feature transformation module then builds correlations within the global embedding and multiple part embeddings, enhancing spatial interactions among semantic pixels. Moreover, the proposed approach does not rely on additional part branches at inference time and reaches state-of-the-art performance on three widely used fine-grained object recognition benchmarks. Experimental results and explainable visualizations demonstrate the effectiveness of the proposed approach.
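
To make the idea of building correlations between a global embedding and several part embeddings more concrete, the following is a minimal sketch of scaled dot-product self-attention over such a token set. It is not the authors' implementation: the abstract does not give the architecture in detail, so the identity query/key/value projections, the single attention head, the function names, and the toy vectors are all assumptions chosen for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention with identity Q/K/V projections.

    tokens: list of equal-length float vectors, here one global embedding
    followed by part embeddings (an assumed layout, not taken from the paper).
    Returns (outputs, weights), where weights[i][j] is how strongly token i
    attends to token j.
    """
    dim = len(tokens[0])
    scale = math.sqrt(dim)
    outputs, weights = [], []
    for query in tokens:
        # Similarity of this token to every token, including itself.
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in tokens]
        row = softmax(scores)
        weights.append(row)
        # Each output is a weighted average of all token vectors.
        outputs.append(
            [sum(w * tok[d] for w, tok in zip(row, tokens)) for d in range(dim)]
        )
    return outputs, weights


# Hypothetical toy inputs: one global embedding and two part embeddings.
global_embedding = [1.0, 0.0, 0.5, 0.5]
part_embeddings = [[0.9, 0.1, 0.4, 0.6], [0.0, 1.0, 0.2, 0.1]]
tokens = [global_embedding] + part_embeddings
outputs, weights = self_attention(tokens)
```

In this toy setup the global token attends more strongly to the first part embedding than to the second, because the first is closer to it in dot-product terms. A real Transformer layer would add learned projections, multiple heads, residual connections, and normalization.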

Keywords:
Discriminative model; Computer science; Artificial intelligence; Embedding; Inference; Transformer; Pattern recognition (psychology); Feature extraction; Pixel; Feature (linguistics); Transformation (genetics); Deep learning; Machine learning; Natural language processing; Engineering

Metrics

Cited By: 64
FWCI (Field Weighted Citation Impact): 4.29
Refs: 92
Citation Normalized Percentile: 0.95

Topics

Advanced Neural Network Applications
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Advanced Image and Video Retrieval Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Domain Adaptation and Few-Shot Learning
Physical Sciences →  Computer Science →  Artificial Intelligence

Related Documents

DISSERTATION

Visual fine-grained recognition

Marcel Simon

University: Thüringer Universitäts- und Landesbibliothek Year: 2019

JOURNAL ARTICLE

Attention-Guided Spatial Transformer Networks for Fine-Grained Visual Recognition

Dichao Liu, Yu Wang, Jien Kato

Journal: IEICE Transactions on Information and Systems Year: 2019 Vol: E102.D (12) Pages: 2577-2586