In this paper, we propose a lightweight object detector obtained by optimizing the structure of YOLOv3. The optimization has two aspects: replacing structural components with lightweight substitutes, and introducing attention mechanisms to increase detection accuracy. For the simplification, we remodel the backbone based on MobileNet v2 and replace every 3×3 convolution in the detection neck and head with the fusion of a 3×3 depthwise separable convolution and a squeeze-and-excitation block (DSConv+SE). For the attention enhancement, we add the high-frequency wavelets of the original image to the input, a simplified non-local block to the simplified backbone, and convolutional block attention modules to the simplified detection neck. In addition, local 3×3 convolution branches are added to the simplified backbone to enhance its learning capability. Experiments demonstrate that the proposed detector outperforms each compared state-of-the-art work in one or more aspects.
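To give a rough sense of why swapping a standard 3×3 convolution for DSConv+SE lightens the neck and head, the sketch below counts weights for both variants. It is an illustration under stated assumptions, not the paper's configuration: the channel sizes (256 in, 512 out), the SE reduction ratio of 16, the placement of the SE block on the output channels, and the omission of biases and batch-norm parameters are all hypothetical choices made here.

```python
# Weight-count comparison: standard 3x3 conv vs. DSConv+SE.
# All concrete numbers (channels, reduction ratio) are assumptions
# for illustration; biases and batch-norm parameters are ignored.

def conv3x3_params(c_in: int, c_out: int) -> int:
    """Weights of a standard 3x3 convolution."""
    return 3 * 3 * c_in * c_out

def dsconv_params(c_in: int, c_out: int) -> int:
    """Weights of a 3x3 depthwise conv followed by a 1x1 pointwise conv."""
    depthwise = 3 * 3 * c_in   # one 3x3 filter per input channel
    pointwise = c_in * c_out   # 1x1 conv that mixes channels
    return depthwise + pointwise

def se_params(channels: int, reduction: int = 16) -> int:
    """Weights of the two fully connected layers in an SE block."""
    hidden = max(channels // reduction, 1)
    return 2 * channels * hidden  # squeeze FC + excitation FC

def dsconv_se_params(c_in: int, c_out: int, reduction: int = 16) -> int:
    """DSConv followed by an SE block on the output channels (assumed layout)."""
    return dsconv_params(c_in, c_out) + se_params(c_out, reduction)

standard = conv3x3_params(256, 512)
light = dsconv_se_params(256, 512)
print(standard, light, round(light / standard, 4))
```

With these assumed sizes, the DSConv+SE variant needs 166,144 weights against 1,179,648 for the standard convolution, about 14% of the original. The actual saving in the proposed detector depends on its real channel widths and SE settings, which the text above does not specify.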
Jingxin An, Tao Peng, Muhamad Dwisnanto Putro, Byeong-Woo Kim
Yu Zhang, Wenhui Chen, Songlin Li, Hailong Liu, Qing Hu
Xianghong He, Yue Zhang, Qiang Zhan
Y. H. Zheng, Yan Zhao, Tiantian Liu
Qunpo Liu, Jingwen Zhang, Zhuoran Zhang, Xuhui Bu, Naohiko Hanajima