Fine-grained crops, such as rice and dried tea leaves, are small and usually densely overlapped in images. A single sample of such an object cannot represent the features of a cluster of samples, which poses significant challenges for recognizing this class of objects. In this paper, we use mobile phone cameras to collect images of fine-grained crops (as shown in Fig. 1) and propose a Hierarchical Convolution Neural Network (H-CNN) based on an attention mechanism to efficiently classify fine-grained crop images, using tea of ranked quality as a case study. We established classification models for four categories of tea, namely Meitan Turquoise Bud (MTB), Zunyi black tea, Biluochun, and Longjing tea, each with five quality grades. The major results are as follows: (1) A model trained on images from a single mobile phone generalizes very poorly: its test accuracy is low on images collected by other mobile phones. When images collected by two different mobile phones are used for training, the model achieves significantly higher test accuracy on a third phone. Using three or more mobile phones for training yields only marginal further improvement (as shown in Fig. 2). (2) H-CNN with the attention mechanism achieves an average accuracy of more than 93%, and its prediction accuracy on images taken by other mobile phones exceeds 87%, which is superior to existing CNN models (89% and 82.7%, respectively, using VGG19 [29]).
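The cross-device evaluation described in result (1) can be illustrated with a minimal sketch: hold out every image from one phone for testing and train only on images from the other phones. This is an illustration under stated assumptions, not the authors' pipeline. The record layout, the phone identifiers, the label strings, and the helper names `split_by_phone` and `accuracy` are all hypothetical.

```python
def split_by_phone(records, train_phones, test_phone):
    """Split image records so that the test phone never appears in training.

    Each record is assumed (hypothetically) to be a dict with the keys
    "image", "phone", and "label".
    """
    train_phones = set(train_phones)
    if test_phone in train_phones:
        raise ValueError("the held-out phone must not be used for training")
    train = [r for r in records if r["phone"] in train_phones]
    test = [r for r in records if r["phone"] == test_phone]
    return train, test


def accuracy(predicted, actual):
    """Fraction of predictions that match the ground-truth labels."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Toy records standing in for real photos; names and labels are made up.
records = [
    {"image": "img_000.jpg", "phone": "phone_A", "label": "MTB_grade1"},
    {"image": "img_001.jpg", "phone": "phone_A", "label": "MTB_grade2"},
    {"image": "img_002.jpg", "phone": "phone_B", "label": "MTB_grade1"},
    {"image": "img_003.jpg", "phone": "phone_B", "label": "MTB_grade2"},
    {"image": "img_004.jpg", "phone": "phone_C", "label": "MTB_grade1"},
    {"image": "img_005.jpg", "phone": "phone_C", "label": "MTB_grade2"},
]

# Train on two phones, evaluate on a third, unseen phone.
train, test = split_by_phone(records, ["phone_A", "phone_B"], "phone_C")
```

Repeating the split with one, two, and three training phones, and scoring a trained classifier on the held-out phone with `accuracy`, would reproduce the shape of the comparison reported above, though the classifier itself is outside the scope of this sketch.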
Jingyang She, Lirong Yan, Wenjiang Liu, Fuwu Yan, Yibo Wu
Zhiwen Zheng, Juxiang Zhou, Jianhou Gan, Sen Luo, Wei Gao
Guofeng Yang, Yong He, Yong Yang, Beibei Xu
Sri Teja Allaparthi, Ganesh Yaparla, Vikram Pudi
Sifeng Wang, Shengxiang Li, Anran Li, Zhaoan Dong, Guangshun Li, Chao Yan