In this paper, we propose a novel pipeline for estimating the 6D pose of known objects in complex scenes from RGB-D images. The pipeline operates directly on the raw point clouds extracted from RGB-D scans. Specifically, our method takes a point cloud as input and regresses, for each point, the unit vectors pointing toward the 3D keypoints of the object. We then use these vectors to generate keypoint hypotheses, from which 6D object pose hypotheses are computed. Finally, we select the best 6D pose among the hypotheses using a proposed scoring mechanism with geometric constraints. Extensive experiments show that the proposed method is robust to variations in object shape and appearance as well as to occlusions between objects, and that it outperforms state-of-the-art methods on the LINEMOD and Occlusion LINEMOD datasets.
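The two geometric steps described above (turning point-wise unit vectors into a keypoint hypothesis, and turning keypoint hypotheses into a pose hypothesis) can be illustrated with a minimal NumPy sketch. It is written under assumptions and is not the paper's implementation: the function names `keypoint_hypothesis` and `rigid_transform` are hypothetical, a keypoint hypothesis is assumed to be the midpoint of the shortest segment between two predicted lines, and the pose is assumed to be a least-squares rigid fit (Kabsch/SVD) between model keypoints and hypothesized scene keypoints. The scoring mechanism is not sketched because the abstract does not define it.

```python
import numpy as np


def keypoint_hypothesis(p1, v1, p2, v2, eps=1e-9):
    """Return one 3D keypoint hypothesis from two scene points and their
    predicted direction vectors, or None if the directions are near-parallel.

    Assumption: the hypothesis is the midpoint of the shortest segment
    between the lines p1 + t*v1 and p2 + s*v2.
    """
    w = p1 - p2
    a, b, c = v1 @ v1, v1 @ v2, v2 @ v2
    d, e = v1 @ w, v2 @ w
    denom = a * c - b * b          # zero when the two lines are parallel
    if denom < eps:
        return None
    t = (b * e - c * d) / denom    # parameter of the closest point on line 1
    s = (a * e - b * d) / denom    # parameter of the closest point on line 2
    return 0.5 * ((p1 + t * v1) + (p2 + s * v2))


def rigid_transform(model_kps, scene_kps):
    """Least-squares rotation R and translation t such that
    scene_kps[i] is approximately R @ model_kps[i] + t (Kabsch algorithm).

    Both inputs are (N, 3) arrays of corresponding keypoints, N >= 3.
    """
    mu_m = model_kps.mean(axis=0)
    mu_s = scene_kps.mean(axis=0)
    # 3x3 cross-covariance of the centered keypoint sets
    H = (model_kps - mu_m).T @ (scene_kps - mu_s)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a proper rotation (det = +1)
    sign = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, sign]) @ U.T
    t = mu_s - R @ mu_m
    return R, t
```

In a full pipeline, `keypoint_hypothesis` would typically be called on many sampled point pairs per keypoint, and `rigid_transform` on each resulting set of keypoint hypotheses, yielding the pose hypotheses among which the best one is then selected.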
Jianxin Ren, Jinghua Wu, Yalei Liu
Liang Mei, Jingen Liu, Alfred O. Hero, Silvio Savarese