Hao Li, Lixin Zheng, Laicheng Yan, Kai Tang
Abstract: A lightweight 6-DoF grasp-pose estimation framework guided by RGB-D images is presented for robotic manipulation in unstructured settings. The pipeline first employs a Key-Region Grasp Guidance Model (KRGGM) to generate heat-maps that highlight candidate grasp regions and provide grid-based 2-D pose priors. These regions are then refined by a Local Point-Cloud Grasp Network (LPCGN), which predicts precise 6-DoF poses, while a Local Geometric Attention module efficiently fuses 2-D image features into the 3-D point cloud. Extensive experiments on the benchmark dataset indicate that our approach achieves 3-4 times faster inference on industrial edge devices than existing methods, reduces training time by more than 50%, and maintains competitive accuracy. Real-world experiments on a robotic platform further validate the method's effectiveness, achieving a 94.7% average grasp success rate in single-object scenarios and a 92% scene completion rate in cluttered environments.
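The abstract names three components: KRGGM for heat-map-guided 2-D priors, LPCGN for 6-DoF pose regression, and a Local Geometric Attention module for fusing 2-D image features into the 3-D point cloud. The sketch below is a minimal, hypothetical rendering of how such a two-stage pipeline could compose in PyTorch; the module names come from the paper, but every layer choice, tensor shape, and head dimension is an illustrative assumption, not the authors' implementation.

```python
import torch
import torch.nn as nn

class KRGGM(nn.Module):
    """Key-Region Grasp Guidance Model (sketch): predicts a grasp-region
    heat-map and grid-based 2-D pose priors from an RGB image.
    The tiny convolutional backbone here is a placeholder assumption."""
    def __init__(self, in_ch=3, feat_ch=64):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(in_ch, feat_ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(feat_ch, feat_ch, 3, padding=1), nn.ReLU(),
        )
        self.heatmap_head = nn.Conv2d(feat_ch, 1, 1)  # candidate grasp regions
        self.prior_head = nn.Conv2d(feat_ch, 4, 1)    # per-cell 2-D pose prior

    def forward(self, rgb):
        f = self.backbone(rgb)
        return torch.sigmoid(self.heatmap_head(f)), self.prior_head(f), f


class LocalGeometricAttention(nn.Module):
    """Stand-in for the paper's Local Geometric Attention: fuses 2-D image
    feature tokens into per-point 3-D features via generic cross-attention."""
    def __init__(self, dim=64):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)

    def forward(self, point_feats, image_feats):
        fused, _ = self.attn(point_feats, image_feats, image_feats)
        return point_feats + fused  # residual fusion of appearance cues


class LPCGN(nn.Module):
    """Local Point-Cloud Grasp Network (sketch): regresses a 6-DoF grasp
    pose (3-D translation plus a 6-D rotation representation, an assumed
    parameterization) from fused local point features."""
    def __init__(self, dim=64):
        super().__init__()
        self.point_encoder = nn.Sequential(nn.Linear(3, dim), nn.ReLU(),
                                           nn.Linear(dim, dim), nn.ReLU())
        self.fusion = LocalGeometricAttention(dim)
        self.pose_head = nn.Linear(dim, 9)  # 3 translation + 6-D rotation

    def forward(self, points, image_tokens):
        f = self.point_encoder(points)        # (B, N, dim) point features
        f = self.fusion(f, image_tokens)      # inject 2-D image cues
        return self.pose_head(f.mean(dim=1))  # pooled 6-DoF pose estimate


# Hypothetical usage: run the image branch, flatten its feature map into
# tokens, then regress a pose from a local point-cloud crop.
rgb = torch.randn(1, 3, 64, 64)
points = torch.randn(1, 512, 3)               # local point-cloud region
heatmap, priors, img_f = KRGGM()(rgb)
tokens = img_f.flatten(2).transpose(1, 2)     # (B, H*W, dim) image tokens
pose = LPCGN()(points, tokens)                # (B, 9): translation + rotation
```

In this reading, the heat-map would select local point-cloud crops around candidate regions, and the attention module supplies the 2-D-to-3-D feature fusion the abstract credits for the pipeline's efficiency.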