JOURNAL ARTICLE

Hierarchical Attention Network for Open-Set Fine-Grained Image Recognition

Jiayin SunHong WangQiulei Dong

Year: 2023 Journal:   IEEE Transactions on Circuits and Systems for Video Technology Vol: 34 (5)Pages: 3891-3904   Publisher: Institute of Electrical and Electronics Engineers

Abstract

Triggered by the success of transformers in various visual tasks, the spatial self-attention mechanism has recently attracted more and more attention in the computer vision community. However, we empirically found that a typical vision transformer with the spatial self-attention mechanism could not learn accurate attention maps for distinguishing different categories of fine-grained images. To address this problem, motivated by the temporal attention mechanism in brains, we propose a hierarchical attention network for learning fine-grained feature representations, called HAN, where the features learnt by implementing a sequence of spatial self-attention operations corresponding to multiple moments are aggregated progressively. The proposed HAN consists of four modules: a self-attention backbone module for learning a sequence of features with self-attention operations, a spatial feature self-organizing module for facilitating the model training, a hierarchical aggregation module for aggregating the re-organized features via a Long Short-Term Memory network, and a context-aware module that is implemented as the forget block of the hierarchical aggregation module for preserving/forgetting the long-term memory by utilizing contextual information. Then, we propose a HAN-based method for open-set fine-grained recognition by integrating the proposed HAN network with a linear classifier, called HAN-OSFGR. Extensive experimental results on 3 fine-grained datasets and 2 coarse-grained datasets demonstrate that the proposed HAN-OSFGR outperforms 9 state-of-the-art open-set recognition methods significantly in most cases.

Keywords:
Computer science Forgetting Artificial intelligence Pattern recognition (psychology) Spatial contextual awareness Classifier (UML) Set (abstract data type)

Metrics

15
Cited By
2.73
FWCI (Field Weighted Citation Impact)
74
Refs
0.89
Citation Normalized Percentile
Is in top 1%
Is in top 10%

Citation History

Topics

Advanced Neural Network Applications
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Domain Adaptation and Few-Shot Learning
Physical Sciences →  Computer Science →  Artificial Intelligence
Visual Attention and Saliency Detection
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
© 2026 ScienceGate Book Chapters — All rights reserved.