Fine-grained classification has become a popular task because it applies to many everyday problems, such as helping people distinguish between animal species or vehicle models. Several approaches to the task exist, but few exploit multiple attention regions in an image to better recognize fine details. In this paper, we propose a novel network, the Multiple Recurrent Attention Convolutional Neural Network (MRA-CNN), which uses a Multiple Attention Proposal Network (MAPN) to localize multiple key features and classify subcategories based on them. Attention localization and the classification of each sub-image reinforce each other. We design a specialized loss function for the MAPN, consisting of two losses, that encourages the network to generate key features that are distinct from one another and that each carry key information. We conduct our experiments mainly on the CUB-Birds dataset (CUB-200-2011), where our model achieves an overall accuracy of 85.6%.
Ang Li, Jianxin Chen, Bin Kang, Wenqin Zhuang, Xuguang Zhang
Heliang Zheng, Jianlong Fu, Tao Mei, Jiebo Luo
Jianlong Fu, Heliang Zheng, Tao Mei