Jichu Ou, Wanyi Li, Jingmin Huang, Xiaojie Huang, Xuan Xie
Abstract Fine‐grained visual categorization (FGVC) is a challenging task that faces issues such as inter‐class similarity, large intra‐class variance, scale variation, and angle variation. To address these issues, the authors propose a novel multiscale attention dynamic aware network (MADA‐Net). The core of the network consists of three parallel sub‐networks, which learn features at different scales. Each sub‐network is composed of three serial sub‐modules: (1) A self‐attention module (SAM) locates objects according to the relative importance distributed across the feature map. (2) A multiscale feature extractor (MFE) learns the non‐linear features of objects. (3) A dynamic aware module (DAM) enhances the network's capability to learn spatial deformation, generating high‐quality feature maps. In addition, the authors propose a multiscale adjusted loss (MA‐Loss) to improve the performance of the network. Experiments on three prevailing benchmark datasets demonstrate that the proposed method achieves state‐of‐the‐art performance.
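The layout described in the abstract (three parallel sub‐networks, each running SAM, MFE, and DAM in series) can be illustrated with a minimal structural sketch. This is not the authors' implementation: every function name below is hypothetical, the three stage bodies are simple placeholders rather than the real modules, and producing the scales by strided subsampling with factors 1, 2, and 4 is an assumption, since the abstract does not say how the scales are obtained or how the branch outputs are combined.

```python
from typing import Callable, List, Sequence

FeatureMap = List[List[float]]
Stage = Callable[[FeatureMap], FeatureMap]


def downsample(x: FeatureMap, factor: int) -> FeatureMap:
    """Keep every `factor`-th row and column (assumed stand-in for rescaling)."""
    return [row[::factor] for row in x[::factor]]


def sam_placeholder(x: FeatureMap) -> FeatureMap:
    """Placeholder for the SAM: weight each value by its share of total magnitude."""
    total = sum(abs(v) for row in x for v in row) or 1.0
    return [[v * abs(v) / total for v in row] for row in x]


def mfe_placeholder(x: FeatureMap) -> FeatureMap:
    """Placeholder for the MFE: an element-wise non-linearity (ReLU)."""
    return [[max(0.0, v) for v in row] for row in x]


def dam_placeholder(x: FeatureMap) -> FeatureMap:
    """Placeholder for the DAM: identity copy; the real module handles deformation."""
    return [row[:] for row in x]


def run_branch(x: FeatureMap, stages: Sequence[Stage]) -> FeatureMap:
    """Apply the serial sub-modules of one sub-network in order."""
    for stage in stages:
        x = stage(x)
    return x


def mada_net_sketch(x: FeatureMap, factors: Sequence[int] = (1, 2, 4)) -> List[FeatureMap]:
    """Run one SAM -> MFE -> DAM branch per scale and return the per-scale outputs."""
    stages = (sam_placeholder, mfe_placeholder, dam_placeholder)
    return [run_branch(downsample(x, f), stages) for f in factors]
```

Calling `mada_net_sketch` on a 4×4 input returns three feature maps, one per branch. The outputs are left as a list because the fusion step is not specified in the abstract.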
Yili Ren, Ruidong Lu, Guan Yuan, Dan Hao, Hongjue Li
Bin Kang, Dong Liang, Daoyuan Chen, Tianyu Ding, Mingqiang Wei
Cheng Pang, Hongxun Yao, Xiaoshuai Sun, Sicheng Zhao, Yanhao Zhang
Zhao-Xu Luo, Min‐Hsiang Hung, Yiwen Lu, Kuan‐Wen Chen