JOURNAL ARTICLE

Learning multi-scale attention network for fine-grained visual classification

Peipei Zhao, Siyan Yang, Wei Ding, Ruyi Liu, Wentian Xin, Xiangzeng Liu, Qiguang Miao

Year: 2025   Journal: Journal of Information and Intelligence   Vol: 3 (6)   Pages: 492-503   Publisher: Elsevier BV

Abstract

Fine-grained visual classification (FGVC) is a challenging task because it requires distinguishing subcategories within the same super-category. Recent works mainly use attention-based methods to localize discriminative image regions and capture subtle inter-class differences. However, within a single layer, most attention-based works consider only large-scale attention blocks that are the same size as the feature maps and ignore small-scale attention blocks that are smaller than the feature maps. Exploiting small local regions is important for distinguishing subcategories. In this work, a novel multi-scale attention network (MSANet) is proposed to capture both large and small regions at the same layer in fine-grained visual classification. Specifically, a novel multi-scale attention layer (MSAL) is proposed, which generates multiple groups in each feature map to capture discriminative regions at different scales. The groups based on large-scale regions exploit global features, and the groups based on small-scale regions extract subtle local features. Then, a simple feature fusion strategy integrates the global features and the subtle local features to mine information that is more conducive to FGVC. Comprehensive experiments on the Caltech-UCSD Birds-200-2011 (CUB), FGVC-Aircraft (AIR), and Stanford Cars (Cars) datasets show that our method achieves competitive performance, which demonstrates its effectiveness.
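
The abstract does not specify how MSAL forms its groups or computes attention, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that channels are split evenly into groups, that each group applies softmax attention inside non-overlapping square windows of one size (a window as large as the feature map stands in for a large-scale block, a smaller window for a small-scale block), and that fusion is plain concatenation of pooled values. The function names `window_softmax_attend` and `multi_scale_attention` and the `block_sizes` parameter are hypothetical.

```python
import math


def window_softmax_attend(channel, block):
    """Apply softmax attention inside non-overlapping block x block windows.

    Returns a map of the same size in which each activation is multiplied
    by its softmax weight within its own window.
    """
    h, w = len(channel), len(channel[0])
    if h % block or w % block:
        raise ValueError("block size must divide the feature-map size")
    out = [[0.0] * w for _ in range(h)]
    for top in range(0, h, block):
        for left in range(0, w, block):
            cells = [(r, c)
                     for r in range(top, top + block)
                     for c in range(left, left + block)]
            # Subtract the window maximum before exponentiating for stability.
            peak = max(channel[r][c] for r, c in cells)
            exps = [math.exp(channel[r][c] - peak) for r, c in cells]
            total = sum(exps)
            for (r, c), e in zip(cells, exps):
                out[r][c] = channel[r][c] * e / total
    return out


def multi_scale_attention(features, block_sizes):
    """Assumed grouping: one attention scale per channel group, then fuse.

    features: list of C channels, each an H x W list of lists.
    block_sizes: one window size per group; block == H gives a global
    (large-scale) group, smaller blocks give local (small-scale) groups.
    Returns one pooled value per channel, concatenated across groups.
    """
    n_groups = len(block_sizes)
    if len(features) % n_groups:
        raise ValueError("channel count must be divisible by the group count")
    per_group = len(features) // n_groups
    fused = []
    for g, block in enumerate(block_sizes):
        for channel in features[g * per_group:(g + 1) * per_group]:
            attended = window_softmax_attend(channel, block)
            n_windows = (len(channel) // block) * (len(channel[0]) // block)
            # Each window sums to an attention-weighted average of its cells;
            # average those per-window values to get one number per channel.
            fused.append(sum(map(sum, attended)) / n_windows)
    return fused
```

For a 4 x 4 feature map, `multi_scale_attention(features, [4, 2])` would give the first half of the channels one global window and the second half four 2 x 2 local windows. A real layer would use learned attention and a learned fusion step in place of the fixed softmax and concatenation assumed here.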

Keywords:
Computer science; Scale (ratio); Artificial intelligence; Cartography; Geography

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 89
Citation Normalized Percentile: 0.15

Topics

Advanced Neural Network Applications
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Advanced Image and Video Retrieval Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Domain Adaptation and Few-Shot Learning
Physical Sciences →  Computer Science →  Artificial Intelligence