Unmanned aerial vehicles (UAVs) are "low, slow and small" targets that fly in complex environments. Existing target recognition algorithms, when applied to UAV detection, suffer from high model complexity and large parameter counts, and are prone to missed and false detections. This paper therefore proposes ST-YOLO, a UAV-specific object detection method based on YOLOv5 and the Swin Transformer. ST-YOLO integrates the Swin Transformer into the backbone network of the improved model and applies further modifications to raise performance on visual tasks such as object detection. Compared with the original network, ST-YOLO reduces the parameter count by 10.70% and the computational cost by 10.13%, keeps the FPS essentially unchanged, and improves the mAP by 3.83%. The proposed ST-YOLO therefore achieves better results in terms of detection accuracy, computational cost and model size.
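The reductions quoted above are stated relative to the original network. As a minimal sketch of that arithmetic, the snippet below computes a signed relative change in percent. The function name `relative_change_percent` and the parameter counts are hypothetical and chosen only for illustration; they are not values reported in this paper. The abstract also does not state whether the mAP gain is a relative change or an absolute difference in percentage points, so the sketch covers only the relative case.

```python
def relative_change_percent(baseline: float, improved: float) -> float:
    """Return the signed change of `improved` relative to `baseline`, in percent."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (improved - baseline) / baseline * 100.0


# Hypothetical parameter counts in millions (illustrative only, not from the paper).
baseline_params = 10.0
st_yolo_params = 8.93

param_change = relative_change_percent(baseline_params, st_yolo_params)
print(f"parameter change: {param_change:.2f}%")
```

A negative result indicates a reduction with respect to the baseline, which is the sense in which the parameter and computation figures are reported.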
Jun Ma, Xiao Wang, Cuifeng Xu, Jing Ling
Te Li, Huajun Wang, Guangzhi Li, Songshan Liu, Tang Li
Guiqun Cao, Haoyi Luo, Lingxiao Chen, Zhuyu Zhou, Yanfen Xin, Jian Cheng