JOURNAL ARTICLE

Alignment and localization in fine-grained image recognition

Hanselmann, Harald

Year: 2020   Journal: RWTH Publications (RWTH Aachen)   Publisher: RWTH Aachen University

Abstract

The goal of image recognition is to identify or recognize objects shown in an image. Image recognition tasks can be categorized by the extent of their inter-class variation. General image recognition tasks typically classify images into a wide variety of broad categories and therefore display large inter-class variation. Fine-grained image classification tasks, however, are defined by low inter-class variation. Examples of such tasks include the classification of different car models or animal species. A special case of a fine-grained image classification task is face recognition, where individuals have to be distinguished. For fine-grained tasks, it is important to detect not only which features are in an image, but also where they are located and what their spatial relations are. In this thesis, we look at different methods to align and localize features and discriminative regions for fine-grained image classification. On the one hand, we look at computing dense pixel-wise alignments using 2D-Warping. In this context, we introduce methods for speeding up the computation of the dense alignments, as runtime is the main drawback of 2D-Warping based approaches. Additionally, we introduce a new 2D-Warping algorithm that obtains better results in terms of optimization score and classification accuracy than previous 2D-Warping algorithms. On the other hand, we explore a new method to obtain the local features needed to compute the dense alignments. These features are learned from data using convolutional neural networks (CNNs). Further, we introduce a warped region-of-interest pooling layer based on 2D-Warping that can be inserted into a trained CNN to recognize images with spatial deformations not seen in training. We observe that, for classification accuracy, modeling translation and scaling is most important.
For this reason, we introduce a localization module that handles translation and scale variance, is very lightweight and efficient, and needs only class labels to be trained. We then add an embedding layer and global K-max pooling to obtain a complete and efficient system for fine-grained image classification. While this localization module is effective, it is implemented as a stand-alone module that is trained separately from the classification model. To simplify the training procedure and leverage the benefits of full end-to-end systems, we transform the localization module such that it can be integrated into the classification model and trained jointly. We evaluate our methods on popular and challenging tasks for fine-grained image classification and report very competitive results. On some tasks, we even achieve state-of-the-art accuracy.
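The abstract names global K-max pooling without defining it. As a minimal sketch, and assuming the common reading of the term (each feature channel is reduced to the average of its K largest spatial activations), the operation could look like the following. The function name, the nested-list feature layout, and the example values are illustrative assumptions, not taken from the thesis.

```python
from heapq import nlargest
from typing import List

# Assumed layout: channels x height x width, plain lists standing in for a CNN tensor.
FeatureMap = List[List[List[float]]]


def global_k_max_pool(features: FeatureMap, k: int) -> List[float]:
    """Reduce each channel to the mean of its k largest spatial activations."""
    pooled = []
    for channel in features:
        # Flatten the spatial grid of this channel into one list of activations.
        activations = [value for row in channel for value in row]
        if not 1 <= k <= len(activations):
            raise ValueError("k must be between 1 and the number of spatial positions")
        top_k = nlargest(k, activations)
        pooled.append(sum(top_k) / k)
    return pooled


# Hypothetical 2-channel, 2x2 feature map.
features = [
    [[1.0, 9.0], [4.0, 2.0]],
    [[8.0, 8.0], [0.0, 5.0]],
]
print(global_k_max_pool(features, k=2))  # [6.5, 8.0]
```

Under this reading, K = 1 reduces to global max pooling and K equal to the number of spatial positions reduces to global average pooling, so K interpolates between the two.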

Keywords:
Discriminative model, Pattern recognition (psychology), Convolutional neural network, Pooling, Image (mathematics), Contextual image classification, Feature (linguistics), Feature extraction, Task (project management)

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 0
Citation Normalized Percentile: 0.26

Topics

Face and Expression Recognition
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Medical Image Segmentation Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Neural Networks and Applications
Physical Sciences →  Computer Science →  Artificial Intelligence
