With the rapid development of drone photography technology in recent years, drones have emerged as a new platform for photography and observation. We apply object detection to drone imagery and propose YOLOvis, a YOLO-style 2D object detection network based on the YOLOv8 architecture. We introduce the PatchFPN structure to improve the network's robustness in detecting small objects, and the TranConcat structure to scale features for enhanced usability. YOLOvis uses the challenging VisDrone dataset as its task dataset. With a slight increase in computational cost, the model achieves an mAP50 of 33.7%, a 3.7% improvement over YOLOv8 of the same size and a 17.0% improvement over YOLOv5 of the same size. Additionally, to better tailor the model to specific needs, we provide YOLOvis models of different sizes. The code and pretrained models are available at https://github.com/BarryGUN/bgyolo_v8_application.git.
Fauzan Masykur, Angga Prasetyo, Ismail Abdurrozaq Zulkarnain, Ellisia Kumalasari, Pradityo Utomo
Xianxu Zhai, Huang Zhi-hua, Tao Li, Hanzheng Liu, Siyuan Wang
Shuaihui Qi, Xiaofeng Song, Tongfei Shang, Xiaochang Hu, Kun Han