Huapeng Xu, Guilin Qi, Jingjing Li, Meng Wang, Kang Xu, Huan Gao
This paper investigates a challenging problem known as fine-grained image classification (FGIC). Unlike conventional computer vision problems, FGIC suffers from large intra-class diversity and subtle inter-class differences. Existing FGIC approaches are limited to exploring only the visual information embedded in the images. In this paper, we present a novel approach that can use readily available prior knowledge from either structured knowledge bases or unstructured text to facilitate FGIC. Specifically, we propose a visual-semantic embedding model that derives semantic embeddings from knowledge bases and text, and further trains a novel end-to-end CNN framework to linearly map image features to a rich semantic embedding space. Experimental results on the challenging large-scale UCSD Bird-200-2011 dataset verify that our approach outperforms several state-of-the-art methods by a significant margin.
Soranan Payatsuporn, Boonserm Kijsirikul
Da-Cheng Juan, Chun-Ta Lu, Zhen Li, Futang Peng, Aleksei Timofeev, Yi-Ting Chen, Yaxi Gao, Tom Duerig, Andrew Tomkins, Sujith Ravi
Hao Li, Huiling Chen, Yifei Chen, Rongshan Yu, Wenxian Yang, Liansheng Wang, Bowen Ding, Yuchen Han