Human action recognition (HAR) has attracted the attention of researchers because of its widespread applicability. With the rapid development of deep learning technology, HAR has been greatly improved by deep features. However, many challenges remain, including a shortage of training samples, the effects of view variation, and ineffective spatial-temporal feature representations. To address these problems and further improve the accuracy of HAR, we propose a novel HAR method based on human skeletons and joints. First, a coordinate transformation is performed on the raw skeleton data to eliminate the influence of the camera position. Then, a data augmentation strategy is proposed to address the overfitting caused by an insufficient number of training samples. In our proposed method, a motion data structure named APoM, which is composed of the cross-frame distance vector, the specific angle, and the position vector, connects the movement of joints and the skeleton in the spatial and temporal dimensions and captures skeleton motion details. To evaluate the effectiveness of our method, experiments were conducted on two small-scale public datasets: Florence3D and UTKinect-Action3D. The experimental results show that the proposed method achieved competitive performance, with accuracies of 98.51% and 98.33%, respectively.
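To make the three APoM components concrete, the sketch below builds a simplified descriptor from a skeleton sequence: per-joint displacements between consecutive frames (cross-frame distance vectors), joint positions relative to a reference joint (position vectors, which also normalizes away a camera translation), and the angle between each joint's position vector and its next-frame displacement as one plausible choice for the "specific angle". The function name `apom_features`, the `(T, J, 3)` layout, and the particular angle definition are all assumptions, not the paper's exact formulation.

```python
import numpy as np

def apom_features(skeleton, ref_joint=0):
    """Simplified APoM-style descriptor (hypothetical layout, not the paper's exact definition).

    skeleton: array of shape (T, J, 3) -- T frames, J joints, 3-D coordinates.
    Returns (cross_frame, angle, position).
    """
    skeleton = np.asarray(skeleton, dtype=float)
    # Cross-frame distance vector: displacement of each joint between frames.
    cross_frame = skeleton[1:] - skeleton[:-1]                    # (T-1, J, 3)
    # Position vector: each joint relative to a reference joint (e.g. hip center),
    # which removes camera translation.
    position = skeleton - skeleton[:, ref_joint:ref_joint + 1, :]  # (T, J, 3)
    # Specific angle (assumed form): angle between a joint's position vector
    # and its displacement into the next frame.
    p, d = position[:-1], cross_frame
    cos = np.sum(p * d, axis=-1) / (
        np.linalg.norm(p, axis=-1) * np.linalg.norm(d, axis=-1) + 1e-8)
    angle = np.arccos(np.clip(cos, -1.0, 1.0))                    # (T-1, J)
    return cross_frame, angle, position
```

Stacking these three arrays over a sliding window would yield a fixed-size spatial-temporal input for a downstream classifier.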
Leiyang Xu, Qiang Wang, Xiaotian Lin, Lin Yuan, Xiang Ma
Qipeng Zhang, Tian Wang, Mingjie Zhang, Kexin Liu, Peng Shi, Hichem Snoussi