Lubna Aziz, Md. Sah bin Haji Salam Fc, Usman Ullah Sheikh, Samiullah Khan, Huma Ayub, Sara Ayub
Object detection remains challenging because objects appear at widely varying scales. Modern object detectors generally use a feature pyramid to learn multi-scale representations. However, current feature pyramid designs handle scale imbalance poorly, because they integrate semantic information across scales inefficiently. In this paper, we reformulate feature pyramid construction as a feature reconfiguration process. We then propose a novel detection network, the Multi-level Refinement Feature Pyramid Network (MRFPN), which combines high-level features (i.e., semantic information), middle-level features, and low-level features (i.e., boundary information) in a highly nonlinear yet efficient manner. In particular, we propose a novel contextual feature module (chain parallel pooling) that consists of global attention and lightweight local reconfiguration, and that efficiently gathers task-oriented contextual features across different scales and spatial locations. To evaluate the proposed model, we designed and trained a robust end-to-end single-stage detector, MRFDet, by integrating MRFPN into a conventional SSD model; it achieves better detection performance than most recent single-stage object detectors. In particular, MRFDet achieves an AP of 45.2 on MS-COCO and improves mAP by 4.5% on VOC compared with the conventional SSD. We are releasing the source code of MRFDet to support the research community.