JOURNAL ARTICLE

Hybrid Granularities Transformer for Fine-Grained Image Recognition

Ying Yu, Jinghui Wang

Year: 2023 Journal: Entropy Vol: 25 (4) Pages: 601-601 Publisher: Multidisciplinary Digital Publishing Institute

Abstract

Many current approaches to image classification concentrate solely on the most prominent features within an image, but in fine-grained image recognition even subtle features can play a significant role in classification. In addition, the large intra-class variations and small inter-class differences characteristic of fine-grained image recognition make it difficult for a model to extract features that discriminate between categories. In this paper, we therefore present two lightweight modules that help the network discover more detailed information. (1) The Patches Hidden Integrator (PHI) module randomly selects patches from an image and replaces them with patches from other images of the same class. This allows the network to glean diverse discriminative region information and prevents over-reliance on a single feature, which can lead to misclassification. It also does not increase training time. (2) The Consistency Feature Learning (CFL) module aggregates patch tokens from the last layer, mining local feature information and fusing it with the class token for classification. CFL also uses an inconsistency loss to force the network to learn features common to both tokens, thereby guiding it to focus on salient regions. We conducted experiments on three datasets, CUB-200-2011, Stanford Dogs, and Oxford 102 Flowers, obtaining results of 91.6%, 92.7%, and 99.5%, respectively, which is competitive with other works.
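
The patch replacement that the abstract attributes to PHI can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `phi_swap`, the `patch` and `ratio` parameters, the channels-last batch layout, and the choice to copy donor patches from the same grid position are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np

def phi_swap(images, labels, patch=16, ratio=0.25, rng=None):
    """Hypothetical PHI-style augmentation (illustrative sketch only).

    images: array of shape (n, h, w, c); labels: array of shape (n,).
    For each image, a random subset of grid patches is overwritten with the
    patches at the same grid positions from another image of the same class.
    """
    rng = np.random.default_rng() if rng is None else rng
    n, h, w, _ = images.shape
    gh, gw = h // patch, w // patch          # patch grid size
    k = int(round(ratio * gh * gw))          # patches to replace per image
    out = images.copy()                      # leave the input batch untouched
    for i in range(n):
        # Candidate donors: other images in the batch with the same label.
        donors = np.flatnonzero((labels == labels[i]) & (np.arange(n) != i))
        if donors.size == 0 or k == 0:
            continue                         # no same-class partner: keep as-is
        j = rng.choice(donors)
        cells = rng.choice(gh * gw, size=k, replace=False)
        for cell in cells:
            r, col = divmod(int(cell), gw)
            ys, xs = r * patch, col * patch
            out[i, ys:ys + patch, xs:xs + patch] = images[j, ys:ys + patch, xs:xs + patch]
    return out

# Usage: a toy batch of five 32x32 RGB images, two classes with two members
# each and one class with a single member (which is therefore left unchanged).
batch = np.random.default_rng(0).random((5, 32, 32, 3))
classes = np.array([0, 0, 1, 1, 2])
augmented = phi_swap(batch, classes, patch=16, ratio=0.25)
```

Because the swap is a handful of array copies per image, a step like this adds little overhead to data loading, which is in line with the abstract's remark that PHI does not increase training time.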

Keywords:
Discriminative model; Computer science; Salient; Artificial intelligence; Pattern recognition (psychology); Feature (linguistics); Security token; Consistency (knowledge bases); Class (philosophy); Image (mathematics); Feature extraction; Focus (optics); Contextual image classification; Transformer; Machine learning; Engineering

Metrics

Cited By: 8
FWCI (Field Weighted Citation Impact): 1.46
Refs: 33
Citation Normalized Percentile: 0.78

Topics

Advanced Image and Video Retrieval Techniques
Physical Sciences → Computer Science → Computer Vision and Pattern Recognition
Advanced Neural Network Applications
Physical Sciences → Computer Science → Computer Vision and Pattern Recognition
Image Retrieval and Classification Techniques
Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Related Documents

BOOK-CHAPTER

Group-Attention Transformer for Fine-Grained Image Recognition

Bo Yan, Siwei Wang, En Zhu, Xinwang Liu, Wei Chen

Communications in computer and information science Year: 2022 Pages: 40-54

JOURNAL ARTICLE

Fine grained food image recognition based on swin transformer

Zhiyong Xiao, Guang Diao, Zhaohong Deng

Journal: Journal of Food Engineering Year: 2024 Vol: 380 Pages: 112134-112134

JOURNAL ARTICLE

Structural feature enhanced transformer for fine-grained image recognition

Ying Yu, Wei Wei, Cairong Zhao, Jin Qian, Enhong Chen

Journal: Pattern Recognition Year: 2025 Vol: 169 Pages: 111955-111955

BOOK-CHAPTER

Fine-Grained Image Recognition

Xiu-Shen Wei

Synthesis lectures on computer vision Year: 2023 Pages: 33-140