Jingxin An, Tao Peng, Muhamad Dwisnanto Putro, Byeong-Woo Kim
Object detection is a fundamental computer vision task that simultaneously locates and categorizes objects in images and videos. It is used in fields such as autonomous driving, surveillance, and industrial automation. However, object detectors must balance accuracy against computational efficiency, particularly in resource-constrained environments. This study proposes Mamba*YOLO, which combines Mamba’s linear complexity with YOLOv12’s attention mechanisms through a novel Regional Attention with Gated Enhancement (RAGE) module. RAGE addresses the locality limitations of existing approaches by integrating regional attention with multiplicative feature enhancement. Experimental results on the MS COCO and PASCAL VOC datasets show improvements in mean average precision of 3.6% and 2.9%, respectively, compared with Mamba-YOLO, while achieving 24% fewer parameters than YOLOv11 and a 27% reduction in GFLOPs. These findings demonstrate that the adopted regional adaptive gating approach can effectively bridge the gap between computational efficiency and detection accuracy, enabling its use in real-time object detection applications.
Shuqi Sun, Xiaohui Yang, Jingliang Peng
Haoyu Zhang, Haifeng Song, Zesheng Chen, Min Zhou, Yu Cheng, Hairong Dong
Yu Zhang, Wenhui Chen, Songlin Li, Hailong Liu, Qing Hu