Learning multi-scale attention network for fine-grained visual classification

Peipei Zhao; Siyan Yang; Wei Ding; Ruyi Liu; Wentian Xin; Xiangzeng Liu; Qiguang Miao

doi:10.1016/j.jiixd.2025.04.005

JOURNAL ARTICLE

Year: 2025; Journal: Journal of Information and Intelligence; Vol: 3 (6); Pages: 492-503; Publisher: Elsevier BV

Abstract

Fine-grained visual classification (FGVC) is a challenging task because it requires distinguishing subcategories within the same super-category. Recent works mainly use attention-based methods to localize discriminative image regions and capture subtle inter-class differences. However, at a given layer, most attention-based works consider only large-scale attention blocks that have the same size as the feature maps, and they ignore small-scale attention blocks that are smaller than the feature maps. Exploiting small local regions is important for distinguishing subcategories. In this work, a novel multi-scale attention network (MSANet) is proposed to capture both large and small regions at the same layer in fine-grained visual classification. Specifically, a novel multi-scale attention layer (MSAL) is proposed, which generates multiple groups in each feature map to capture discriminative regions at different scales. The groups based on large-scale regions exploit global features, and the groups based on small-scale regions extract subtle local features. Then, a simple feature fusion strategy integrates the global features and the subtle local features to mine information that is more conducive to FGVC. Comprehensive experiments on the Caltech-UCSD Birds-200-2011 (CUB), FGVC-Aircraft (AIR), and Stanford Cars (Cars) datasets show that our method achieves competitive performance, which demonstrates its effectiveness.
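
The abstract does not give the exact formulation of the MSAL, so the following is only an illustrative sketch of the general idea of attending at several scales within one layer. It assumes (none of this is stated in the source) that channels are split evenly into groups, that each group applies a softmax over spatial positions inside non-overlapping square windows of a group-specific size, that the channel-mean activation serves as the attention score, and that fusion is plain channel concatenation. The names `block_attention` and `multi_scale_attention` are hypothetical, and pure Python lists are used only to keep the sketch dependency-free.

```python
import math


def block_attention(fmap, block):
    """Reweight a feature map with a softmax computed inside each window.

    fmap  : list of C channels, each an H x W nested list of floats.
    block : window side length; block == H == W gives one large-scale
            (global) window, a smaller block gives small-scale windows.
    ASSUMPTION: the score at each position is the channel-mean activation.
    """
    c, h, w = len(fmap), len(fmap[0]), len(fmap[0][0])
    assert h % block == 0 and w % block == 0, "block must divide H and W"

    # Channel-mean saliency map of shape H x W.
    sal = [[sum(fmap[k][i][j] for k in range(c)) / c for j in range(w)]
           for i in range(h)]

    attn = [[0.0] * w for _ in range(h)]
    for bi in range(0, h, block):
        for bj in range(0, w, block):
            cells = [(i, j)
                     for i in range(bi, bi + block)
                     for j in range(bj, bj + block)]
            # Numerically stable softmax over the positions of this window.
            peak = max(sal[i][j] for i, j in cells)
            exps = {(i, j): math.exp(sal[i][j] - peak) for i, j in cells}
            total = sum(exps.values())
            for (i, j), e in exps.items():
                attn[i][j] = e / total

    # Apply the same spatial weights to every channel of the group.
    return [[[fmap[k][i][j] * attn[i][j] for j in range(w)]
             for i in range(h)]
            for k in range(c)]


def multi_scale_attention(fmap, blocks):
    """Split channels into len(blocks) groups, one window size per group.

    ASSUMPTION: fusion is channel concatenation of the reweighted groups.
    """
    groups = len(blocks)
    assert len(fmap) % groups == 0, "channels must split evenly into groups"
    size = len(fmap) // groups
    fused = []
    for n, block in enumerate(blocks):
        fused.extend(block_attention(fmap[n * size:(n + 1) * size], block))
    return fused


# Usage: 4 channels of a 4 x 4 map; one global group and one 2 x 2 group.
features = [[[1.0] * 4 for _ in range(4)] for _ in range(4)]
fused = multi_scale_attention(features, [4, 2])
```

In this sketch the group with the full-size window spreads its attention over the whole map, while the group with 2 x 2 windows normalizes attention locally, so a small region can receive a high weight even when it is not the most salient region globally.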

Keywords:

Computer science; Scale (ratio); Artificial intelligence; Cartography; Geography

Metrics

Cited By

FWCI (Field Weighted Citation Impact): 0.00

Refs

Citation Normalized Percentile: 0.15

Is in top 1%

Is in top 10%

Topics

Advanced Neural Network Applications

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Advanced Image and Video Retrieval Techniques

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Domain Adaptation and Few-Shot Learning

Physical Sciences → Computer Science → Artificial Intelligence

Related Documents

Multi-Scale Attention Constraint Network for Fine-Grained Visual Classification

Attention-based Multi-scale ViT Fine-grained Visual Classification

Multi-scale network via progressive multi-granularity attention for fine-grained visual classification

Multi-level Attention-enhanced Learning for Fine-Grained Visual Classification

Multi-branch and Multi-scale Attention Learning for Fine-Grained Visual Categorization