Visual fine-grained recognition

Marcel Simon

doi:10.22032/dbt.40356

DISSERTATION

Year: 2019

University: Thüringer Universitäts- und Landesbibliothek

Abstract

Object recognition in digital images is crucial for further automation in everyday life and industry. Basic objects can already be distinguished well, but the automated recognition of detailed categories, called fine-grained recognition, remains challenging. Approaches in this field are usually based on an explicit or implicit normalization of the object pose. Explicit approaches describe an object by the appearance of its parts. Most previous works rely on annotated locations of semantic parts in all training images, but such annotations are expensive to obtain. Implicit approaches compute numerous local features and aggregate them without considering their spatial position. This leads to an implicit matching of the appearance of corresponding parts in the distance function of the classifier. The concept does not require annotated part locations, but the resulting features are not necessarily optimal: the features might not lie on a Euclidean manifold, and the aggregation strategy is chosen manually using validation data. In this thesis, we address drawbacks of previous approaches with novel recognition and visualization techniques. We present approaches for explicit pose normalization that do not require part annotations. They are based on generating numerous generic part proposals and selecting the relevant ones for classification. We also improve existing implicit approaches by addressing their main issues. For example, we introduce a novel generalized aggregation scheme that allows the optimal strategy to be learned. The recognition approaches are complemented by two visualizations. We also analyze and predict the influence of random noise on recognition models. We evaluate and discuss all presented ideas extensively, both qualitatively and quantitatively, on widely used benchmark datasets. Our recognition approaches improve the accuracy of the base CNNs by up to 20.6% and also work in other domains such as action recognition.
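The idea of aggregating local features "without considering their spatial position" can be illustrated with a small sketch. It is not taken from the thesis: the function names `average_pool` and `bilinear_pool` and the toy feature values are assumptions chosen for illustration, and the two pooling rules shown are common orderless aggregation strategies rather than the generalized scheme the abstract refers to.

```python
import random


def average_pool(features):
    """Average the local feature vectors, ignoring where each one came from."""
    n = len(features)
    dim = len(features[0])
    return [sum(f[d] for f in features) / n for d in range(dim)]


def bilinear_pool(features):
    """Average the outer products f f^T of all local features (second-order pooling)."""
    n = len(features)
    dim = len(features[0])
    pooled = [[0.0] * dim for _ in range(dim)]
    for f in features:
        for i in range(dim):
            for j in range(dim):
                pooled[i][j] += f[i] * f[j] / n
    return pooled


# Hypothetical toy input: 4 image locations, each with a 3-dimensional local feature.
local_features = [
    [1.0, 0.0, 2.0],
    [0.0, 1.0, 1.0],
    [2.0, 2.0, 0.0],
    [1.0, 1.0, 1.0],
]

# Reordering the locations simulates discarding the spatial layout.
shuffled = local_features[:]
random.Random(0).shuffle(shuffled)

first_order = average_pool(local_features)
second_order = bilinear_pool(local_features)
```

Because both functions only sum over locations, pooling `shuffled` gives the same descriptor as pooling `local_features`. That order independence is what lets a classifier compare corresponding parts without knowing where they are in the image.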

Keywords:

Computer science; Artificial intelligence; Normalization (sociology); Cognitive neuroscience of visual object recognition; Classifier (UML); Visualization; Machine learning; Pattern recognition (psychology); Object (grammar)

Metrics

Cited By

FWCI (Field Weighted Citation Impact): 0.00

Refs

Citation Normalized Percentile: 0.24

Is in top 1%

Is in top 10%

Topics

Human Pose and Action Recognition

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Neural Networks and Applications

Physical Sciences → Computer Science → Artificial Intelligence

Advanced Neural Network Applications

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Related Documents

Multi-View Active Fine-Grained Visual Recognition

Deep learning for fine-grained visual recognition

Annotation modification for fine-grained visual recognition

Fine-grained Visual Recognition based on Prototypical Mamba

Bilinear CNN Models for Fine-Grained Visual Recognition