Abstract: Object detection is a fundamental computer vision problem with applications in robotics, industrial automation, autonomous vehicles, and surveillance. This work compares the performance of several state-of-the-art object recognition models on the COCO dataset: Mask R-CNN (Detectron2), YOLOv8s, YOLOv8l, and YOLOv11s. The models are compared on mean Average Precision (mAP), precision, recall, and inference speed. The results indicate that while Mask R-CNN is accurate, its computational cost makes it less suitable for real-time use. The YOLO models, particularly YOLOv8s, offer a compromise between accuracy and speed and are therefore well suited to real-time detection. YOLOv8l offers somewhat higher accuracy but is computationally more demanding. Owing to its combination of speed and accuracy, YOLOv8s is the most suitable of the compared models for real-time applications. The study is intended to help researchers and developers select suitable object detection models for different applications.
Palak Agarwal, Somya Goel, S.K. Bhagat, Rahul Singh
Chengamma Chitteti, S Mamatha, Preetha Chandra, Sufia Banu, S.N. Reddy
Muhammad Mudassir Ejaz, Tong Boon Tang, Cheng-Kai Lu