Wei Zhou, Yanke Hou, Dihu Chen, Haifeng Hu, Tao Su
Multi-label image classification aims to predict every object category present in an image. Some recent works exploit graph convolutional networks to capture the correlation between labels. Although they report promising results, these methods cannot learn salient object features in the images and ignore the correlation between channel feature maps. In addition, existing studies learn feature information only within each individual input image and fail to mine contextual information about the various categories from the dataset to enhance the input feature representation. To address these issues, we propose an Attention-Augmented Memory Network (AAMN) model for the multi-label image classification task. First, we propose a novel categorical memory module that mines contextual information about the various categories from the dataset to augment the current input feature. Second, we design a new channel-relation exploration module that captures the inter-channel relationships of features, so as to enhance the correlation between objects in the images. Third, we develop a spatial-relation enhancement module that models second-order statistics of features and captures long-range dependencies between pixels in feature maps, so as to learn salient object features. Experimental results on standard benchmarks, including MS-COCO 2014, PASCAL VOC 2007, and VG-500, demonstrate the effectiveness of the AAMN model, which outperforms current state-of-the-art methods.
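To make the idea of an inter-channel relationship concrete, the following is a minimal sketch of generic channel self-attention in NumPy. It is not the AAMN channel-relation exploration module, whose exact design the abstract does not specify; the function names `softmax` and `channel_relation`, the residual connection, and the toy input shape are all assumptions chosen for illustration.

```python
import numpy as np


def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def channel_relation(features):
    """Hypothetical channel self-attention over a (C, H, W) feature map.

    Assumption: channel affinity is the softmax-normalised Gram matrix of
    the flattened channels, and the output adds a residual connection.
    """
    c, h, w = features.shape
    flat = features.reshape(c, h * w)            # (C, N), N = H * W
    affinity = softmax(flat @ flat.T, axis=-1)   # (C, C) channel-to-channel weights
    mixed = affinity @ flat                      # each channel becomes a weighted mix of all channels
    return (flat + mixed).reshape(c, h, w), affinity


# Toy input: 8 channels on a 4x4 spatial grid (illustrative values only).
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 4, 4))
out, aff = channel_relation(x)
```

Each row of `aff` sums to one, so every output channel is the original channel plus a convex combination of all channels, which is one simple way to let correlated feature maps reinforce each other.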
Zheng Yan, Weiwei Liu, Shiping Wen, Yin Yang
Jin Yuan, Shikai Chen, Yao Zhang, Zhongchao Shi, Xin Geng, Jianping Fan, Yong Rui
Ying Chen, Ding Zhang, Tao Han, Xiaoliang Meng, Mianxin Gao, Teng Wang
Wei Zhou, Zhiwu Xia, Peng Dou, Tao Su, Haifeng Hu
Haiying Zhao, Wei Zhou, Xiaogang Hou, Hui Zhu