Lihuang She, Huaidan Liang, Haonan Sun, Yaojing Chen
In real-world scenarios, gesture point cloud data often suffer from distortions that severely degrade recognition accuracy. To address this issue, this paper proposes a 3D hand gesture pose estimation algorithm based on a denoising diffusion probabilistic model (HDDPM) that improves robustness under distorted conditions. The study first categorizes common types of distortion in gesture point cloud data and simulates real-world distortion scenarios, including local occlusion, global point loss, and noise interference, to generate an augmented dataset. In the forward diffusion process, artificial noise is added to the point cloud data, progressively degrading it into noisy observations. In the reverse denoising process, the model learns to restore the original, clean data from the corrupted input. Moreover, spatial attention (SA) and channel attention (CA) mechanisms are integrated to enhance feature extraction. Experiments are conducted on three public datasets: MSRA, ICVL, and NYU. The method achieves Mean Per Joint Position Error (MPJPE) values of 7.66 mm, 6.71 mm, and 10.05 mm on the MSRA, ICVL, and NYU datasets, respectively, outperforming existing methods such as Hand PointNet and other mainstream approaches. Overall, the method holds great potential for practical applications in human–computer interaction, augmented reality (AR), and robotic control. Parts of the core code are publicly available at https://github.com/ynslyyx/HDDPM.
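The three distortion types, the forward noising step, and the MPJPE metric named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the occlusion radius, the drop ratio, the noise scale, and the linear beta schedule (the common DDPM default) are all assumptions chosen for illustration, and the forward step uses the standard closed-form DDPM expression.

```python
import numpy as np

def local_occlusion(points, rng, radius=0.2):
    """Remove every point within `radius` of a randomly chosen seed point.
    (Hypothetical helper; radius is an assumed value.)"""
    seed = points[rng.integers(len(points))]
    keep = np.linalg.norm(points - seed, axis=1) > radius
    return points[keep]

def global_point_loss(points, rng, drop_ratio=0.3):
    """Uniformly discard a fraction of the points.
    (Hypothetical helper; drop_ratio is an assumed value.)"""
    n_keep = max(1, int(round(len(points) * (1.0 - drop_ratio))))
    idx = rng.choice(len(points), size=n_keep, replace=False)
    return points[idx]

def add_noise(points, rng, sigma=0.01):
    """Jitter each coordinate with zero-mean Gaussian noise.
    (Hypothetical helper; sigma is an assumed value.)"""
    return points + rng.normal(0.0, sigma, size=points.shape)

def forward_diffuse(x0, t, betas, rng):
    """Standard closed-form DDPM forward step:
    x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps."""
    alpha_bar = np.cumprod(1.0 - betas)[t]
    eps = rng.normal(size=x0.shape)
    x_t = np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * eps
    return x_t, eps

def mpjpe(pred, gt):
    """Mean Per Joint Position Error: mean Euclidean distance between
    predicted and ground-truth joints, inputs shaped (..., J, 3)."""
    return float(np.linalg.norm(pred - gt, axis=-1).mean())

rng = np.random.default_rng(0)
cloud = rng.uniform(-1.0, 1.0, size=(1024, 3))   # synthetic stand-in for a hand point cloud
betas = np.linspace(1e-4, 0.02, 1000)            # assumed linear schedule

occluded = local_occlusion(cloud, rng)
sparse = global_point_loss(cloud, rng)
noisy = add_noise(cloud, rng)
x_t, eps = forward_diffuse(cloud, 500, betas, rng)
```

In a training loop, a denoising network would typically be asked to predict `eps` from `x_t` and `t`; the sketch stops short of that because the abstract does not describe the network or its loss.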
Maksym Ivashechkin, Oscar Méndez, Richard Bowden
Albert Causo, Mai Matsuo, Etsuko Ueda, Kentaro Takemura, Yoshio Matsumoto, Jun Takamatsu, Tsukasa Ogasawara
Jeongjun Choi, Dongseok Shim, H. Jin Kim
Christopher G. Schwarz, N. da Vitoria Lobo