DISSERTATION

Discriminative Feature Learning with Application to Fine-grained Recognition

Yaming Wang

Year: 2018   University: University Libraries (University of Maryland)   Publisher: University of Maryland, College Park

Abstract

For various computer vision tasks, finding suitable feature representations is fundamental. Fine-grained recognition, which distinguishes sub-categories under the same super-category (e.g., bird species, car makes and models), is a good task for studying discriminative feature learning for visual recognition. The main reason is that the inter-class variations between fine-grained categories are very subtle, even smaller than the intra-class variations caused by pose or deformation. This thesis focuses on tasks mostly related to fine-grained categories. After briefly discussing our earlier attempt to capture subtle visual differences using sparse/low-rank analysis, the main part of the thesis reflects the trend of the past few years, in which deep learning has come to prevail. In the first part of the thesis, we address the problem of fine-grained recognition via a patch-based framework built upon Convolutional Neural Network (CNN) features. We introduce triplets of patches with two geometric constraints to improve the accuracy of patch localization, and automatically mine discriminative geometrically-constrained triplets for recognition. In the second part, we begin to learn discriminative features in an end-to-end fashion. We propose a supervised feature learning approach, the Label Consistent Neural Network, which enforces direct supervision in late hidden layers. We associate each neuron in a hidden layer with a particular class and encourage it to be activated for input signals from the same class by introducing a label consistency regularization. This label consistency constraint makes the features more discriminative and tends to speed up convergence. The third part proposes a more sophisticated and effective end-to-end network specifically designed for fine-grained recognition, which learns discriminative patches within a CNN. We show that the patch-level learning capability of a CNN can be enhanced by learning a bank of convolutional filters that capture class-specific discriminative patches without extra part or bounding box annotations. Such a filter bank is well structured, properly initialized and discriminatively learned through a novel asymmetric multi-stream architecture with convolutional filter supervision and a non-random layer initialization. In the last part, we go beyond obtaining category labels and study the problem of continuous 3D pose estimation for fine-grained object categories. We augment three existing popular fine-grained recognition datasets by annotating each instance in the image with its corresponding fine-grained 3D shape and ground-truth 3D pose. We cast the problem into a detection framework based on Faster/Mask R-CNN. To utilize the 3D information, we also introduce a novel 3D representation, called the location field, that is effective for representing 3D shapes.
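
As a rough illustration of the label consistency idea in the second part, the sketch below penalizes hidden activations that differ from a binary target in which only the neurons associated with the input's class are active. It is a simplified stand-in, not the thesis's formulation: the contiguous neuron-to-class assignment, the squared-error form, and the names `neuron_class_assignment` and `label_consistency_penalty` are all assumptions made for illustration.

```python
def neuron_class_assignment(num_neurons, num_classes):
    """Assign each hidden neuron to one class in contiguous, near-equal blocks.

    The block layout is an assumption; the abstract only says each neuron
    is associated with a particular class.
    """
    return [n * num_classes // num_neurons for n in range(num_neurons)]


def label_consistency_penalty(activations, label, assignment):
    """Mean squared gap between activations and a binary target.

    The target is 1 for neurons assigned to the input's class and 0 otherwise,
    so the penalty is small when only same-class neurons are active.
    (Hypothetical squared-error form, chosen for simplicity.)
    """
    target = [1.0 if cls == label else 0.0 for cls in assignment]
    return sum((a - t) ** 2 for a, t in zip(activations, target)) / len(activations)


# Six hidden neurons shared among three classes: [0, 0, 1, 1, 2, 2].
assignment = neuron_class_assignment(num_neurons=6, num_classes=3)

# Activations for an input whose true class is 0.
consistent = [0.9, 0.8, 0.1, 0.0, 0.0, 0.1]    # class-0 neurons fire
inconsistent = [0.1, 0.0, 0.9, 0.8, 0.0, 0.1]  # class-1 neurons fire instead

print(label_consistency_penalty(consistent, 0, assignment))
print(label_consistency_penalty(inconsistent, 0, assignment))
```

In training, a term like this would presumably be added to the usual classification loss with a weighting factor, so that late hidden layers receive direct supervision in addition to the gradient from the output layer.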

Keywords:
Discriminative model, Artificial intelligence, Feature (linguistics), Pattern recognition (psychology), Computer science, Feature learning, Machine learning, Linguistics

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 0

Topics

Face and Expression Recognition
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
