This paper presents a Transformer-based multi-scale feature fusion network for pedestrian detection, aimed at small-scale pedestrians and pedestrians affected by light and shadow. In the feature extraction stage, the network uses a gating mechanism and feature enhancement to suppress interference from irrelevant features and to make pedestrian features at different scales more discriminative. It also dynamically controls the fusion weights of the feature maps, so that the feature maps are fused adaptively. In the detection stage, the Transformer captures global information and models long-range dependencies between image pixels, which improves detection. In comparisons with existing methods on a general pedestrian detection dataset, the average accuracy of the proposed method is 6.8% higher than that of the YOLOv5 model, the false detection rate is 2.7% lower, and the missed detection rate is 3.1% lower. A subjective evaluation of the pedestrian detection heat maps further shows that the method attends to the human body more comprehensively, rather than focusing on a single point. In summary, the method improves detection accuracy and reduces both the false detection rate and the missed detection rate in pedestrian detection.
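The gated, adaptive fusion of feature maps described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the scalar gate parameters (w_fine, w_coarse, bias), and the nearest-neighbour upsampling are assumptions made only for illustration. The idea shown is that a sigmoid gate, computed per position from both inputs, sets how much each scale contributes to the fused map.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def upsample_nearest(fmap, factor):
    """Nearest-neighbour upsampling of a 2D map given as a list of rows."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def gated_fusion(fine, coarse, w_fine=1.0, w_coarse=1.0, bias=0.0):
    """Fuse a fine and a coarse feature map with a per-position gate.

    Hypothetical gate: g = sigmoid(w_fine * f + w_coarse * c + bias).
    Fused value: g * f + (1 - g) * c, a convex combination of the inputs.
    In a trained network the gate parameters would be learned.
    """
    factor = len(fine) // len(coarse)
    coarse_up = upsample_nearest(coarse, factor)
    fused = []
    for row_f, row_c in zip(fine, coarse_up):
        fused_row = []
        for f, c in zip(row_f, row_c):
            g = sigmoid(w_fine * f + w_coarse * c + bias)
            fused_row.append(g * f + (1.0 - g) * c)
        fused.append(fused_row)
    return fused


# Toy inputs: a 4x4 fine-scale map and a 2x2 coarse-scale map.
fine = [[0.0, 1.0, 2.0, 3.0],
        [1.0, 0.0, 1.0, 2.0],
        [2.0, 1.0, 0.0, 1.0],
        [3.0, 2.0, 1.0, 0.0]]
coarse = [[1.0, 0.0],
          [0.0, 1.0]]
fused = gated_fusion(fine, coarse)
```

Because the gate lies strictly between 0 and 1, every fused value falls between the two inputs at that position, so neither scale can be amplified beyond its original response; the gate only shifts the balance between them.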
Lincai Huang, Zhiwen Wang, Xiaobiao Fu
Hao Xia, Jun Ma, Jiayu Ou, Xinyao Lv, Chengjie Bai
Chaoqi Yan, Hong Zhang, Xuliang Li, Yifang Yang, Hao Chen, Ding Yuan
Ying Zhang, Lin Wu, Huaxuan Deng, Jun Hu, Xifan Li