DISSERTATION

Visual fine-grained recognition

Marcel Simon

Year: 2019 University: Thüringer Universitäts- und Landesbibliothek

Abstract

Object recognition in digital images is crucial for further automation in everyday life and industry. Basic objects can already be distinguished well, but the automated recognition of detailed categories, called fine-grained recognition, remains challenging. Approaches in this field are usually based on an explicit or implicit normalization of the object pose. Explicit approaches describe an object by the appearance of its parts. Most previous works use annotated locations of semantic parts in all training images. However, such annotations are expensive to obtain. Implicit approaches compute numerous local features and aggregate them without considering their spatial position. This leads to an implicit matching of the appearance of corresponding parts in the distance function of the classifier. The concept does not require annotated part locations, but the resulting features are not necessarily optimal, because the features might not lie on a Euclidean manifold and because the aggregation strategy is chosen manually using validation data. In this thesis, we address drawbacks of previous approaches with novel recognition and visualization techniques. We present approaches for explicit pose normalization that do not require part annotations. They are based on generating numerous generic part proposals and selecting the relevant ones for classification. Existing implicit approaches are also improved by addressing their main issues. For example, we introduce a novel generalized aggregation scheme, which allows the optimal strategy to be learned. The recognition approaches are complemented with two visualizations. We also analyze and predict the influence of random noise on recognition models. We extensively evaluate and discuss all presented ideas, both qualitatively and quantitatively, on widely used benchmark datasets. Our recognition approaches improve the accuracy of the base CNNs by up to 20.6% and even work in other domains such as action recognition.

Keywords:
Computer science, Artificial intelligence, Normalization (sociology), Cognitive neuroscience of visual object recognition, Classifier (UML), Visualization, Machine learning, Pattern recognition (psychology), Object (grammar)

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 0
Citation Normalized Percentile: 0.24

Topics

Human Pose and Action Recognition
Physical Sciences → Computer Science → Computer Vision and Pattern Recognition
Neural Networks and Applications
Physical Sciences → Computer Science → Artificial Intelligence
Advanced Neural Network Applications
Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Related Documents

JOURNAL ARTICLE

Deep learning for fine-grained visual recognition

Teng Li

Journal: Adelaide Research & Scholarship (AR&S) (University of Adelaide) Year: 2017

JOURNAL ARTICLE

Annotation modification for fine-grained visual recognition

Changzhi Luo, Zhijun Meng, Jiashi Feng, Bingbing Ni, Meng Wang

Journal: Neurocomputing Year: 2016 Vol: 274 Pages: 58-65