This paper presents a visual attention based convolutional neural network (CNN) to solve the image classification problem in the real complex world scene. The presented method can simulate the process of recognizing objects and find the area of interest which is related with the task. Compared with the CNN method in image classification, the model is proficient in fine-grained classification problem and has a better robustness due to its mechanism of multi-glance and visual attention. We evaluate the model on vehicle dataset, where its performance exceeds CNN baseline on image classification.
Xiaohong CaiMing LiHui CaoJingang MaXiaoyan WangXuqiang Zhuang
Jiaqi ShaoChangwen QuJianwei LiShujuan Peng
张祥东 Zhang Xiangdong王腾军 Wang Tengjun朱劭俊 Zhu Shaojun杨耘 Yang Yun
Yuntao LiuYong DouRuochun JinPeng Qiao
Anjun ZhangLu JiaJun WangChuanjian Wang