Palak Agarwal, Somya Goel, S.K. Bhagat, Rahul Singh
Abstract: Object detection is a fundamental computer vision problem, with applications in robotics, industrial automation, autonomous vehicles, and surveillance. This work compares the performance of several state-of-the-art object detection models on the COCO dataset: Mask R-CNN (Detectron2), YOLOv8s, YOLOv8l, and YOLOv11s. The models are compared on mean Average Precision (mAP), precision, recall, and inference speed. The results indicate that although Mask R-CNN is accurate, its computational cost makes it less suitable for real-time use. The YOLO models, particularly YOLOv8s, strike a balance between accuracy and speed and are therefore well suited to real-time detection. YOLOv8l offers somewhat higher accuracy but is computationally more demanding. Given its combination of speed and accuracy, YOLOv8s emerges from this comparison as the most suitable model for real-time applications. The study gives researchers and developers guidance for selecting object detection models for different applications.
Chengamma Chitteti, S Mamatha, Preetha Chandra, Sufia Banu, S.N. Reddy
Muhammad Mudassir Ejaz, Tong Boon Tang, Cheng‐Kai Lu