Autonomous driving relies on perception systems that use computer vision for object detection and segmentation, but these tasks demand significant computational power, which poses a challenge for low-power embedded systems. This paper proposes a multitask learning network for traffic object detection, drivable road lane segmentation, and lane line segmentation, which achieved second place in the Low-power Deep Learning Object Detection and Semantic Segmentation Multitask Model Compression Competition for Traffic Scene in Asian Countries. The model is designed for real-time autonomous driving systems with limited computational resources and completes inference within 20 milliseconds. It combines an efficient backbone and multitask head architecture, customized class balancing, and an optimized training loss. We evaluated the proposed multitask YOLO (MT-YOLO) model on several embedded platforms with AI processing units capable of accelerating quantized neural networks. The design accounts for highly customized heterogeneous hardware, so the model meets real-time requirements on multiple platforms while maintaining accuracy.
Ya Yuan, Wanli Dong, Sicong Yang, Tianya Wu