JOURNAL ARTICLE

Deep learning-based object detection and robotic arm grasping

ZHANG Lei, ZHANG Senhui, YAN Song, YUAN Yuan

Year: 2024  Journal: DOAJ (Directory of Open Access Journals)

Abstract

To address the slow speed and poor performance of multi-object grasp detection in unstructured environments, a method that performs object detection before grasp detection is proposed. For object detection, this paper improves the YOLOv5 network with depthwise separable convolutions and a coordinate attention mechanism to accelerate the network's running speed. For the grasping task, a single-stage grasp pose detection algorithm was designed. First, considering the interference present in unstructured environments, RGB-D images were selected as the input to the grasping network, and GG-CNN was chosen as the backbone. Second, to enhance the grasping network's feature extraction capability, different-sized convolutional kernels were used in parallel, as in the Inception-ResNet module, to broaden the network's receptive field. Additionally, a parameter-free three-dimensional attention mechanism enables the network to focus on grasp-relevant features and suppress background noise. Finally, a grasp quality evaluation refines the candidate grasp boxes, and the box with the highest confidence score is output. The experimental results show that the improved object detection network has 2,776,708 parameters and runs at 102 frames per second (FPS). On the public Cornell dataset, the improved grasp detection network achieves an accuracy of 96.57% at 54.17 FPS. The two improved networks can be deployed together on a robotic arm and effectively accomplish grasping tasks in multi-object scenes, making them suitable for practical industrial applications.
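The abstract attributes the reduced parameter count (2,776,708) partly to depthwise separable convolutions, which factor a standard convolution into a per-channel (depthwise) convolution followed by a 1×1 (pointwise) convolution. A minimal sketch of the parameter arithmetic behind that saving; the channel sizes below are illustrative examples, not figures from the paper:

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k

def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Depthwise k x k conv (c_in * k * k weights)
    followed by a 1x1 pointwise conv (c_in * c_out weights)."""
    return c_in * k * k + c_in * c_out

# Example layer: 64 -> 128 channels with a 3x3 kernel.
std = standard_conv_params(64, 128, 3)      # 73_728 weights
ds = depthwise_separable_params(64, 128, 3) # 8_768 weights
print(f"standard: {std}, separable: {ds}, reduction: {std / ds:.1f}x")
```

For this example layer the factorization cuts the weight count by roughly 8x, which is the kind of saving that lets the improved YOLOv5 variant stay under three million parameters while sustaining real-time frame rates.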

Keywords:
Object detection; Feature extraction; Robotic arm; Grasp pose; Convolutional neural network

Metrics

Cited by: 1
FWCI (Field-Weighted Citation Impact): 0.64
References: 0
Citation Normalized Percentile: 0.68
Topics

Robot Manipulation and Learning
Physical Sciences →  Engineering →  Control and Systems Engineering
Hand Gesture Recognition Systems
Physical Sciences →  Computer Science →  Human-Computer Interaction
Human Pose and Action Recognition
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition