Hand gestures are widely used in human-computer interaction, but dynamic gesture recognition remains a challenging task. In this paper, we propose a multimodal human gesture recognition method based on Dempster-Shafer evidence theory. First, audio-based and skeleton-based command recognition models are established. Then, an alignment method for the multimodal recognition results of continuous gestures is proposed, which groups together the outputs of the audio-based model and the skeleton-based model that correspond to the same action. The results within each group are then fused using Dempster-Shafer evidence theory. Finally, the performance of the method is evaluated on the ChaLearn Multi-modal Gesture Recognition dataset. The results show that fusing audio and skeleton information effectively improves the recognition accuracy of dynamic gestures.
Qihua Xu, Bo Sun, Jun He, Bojie Rong, Lejun Yu, Penghao Rao
Thomas Bürger, Oya Aran, Alexandra Urankar, Alice Caplier, Lale Akarun
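
The fusion step described in the abstract can be illustrated with Dempster's rule of combination. The following is a minimal sketch, not the authors' implementation: the abstract does not state how mass functions are built from the model outputs, so the gesture labels (`g1`, `g2`, `g3`), the mass values, and the function names `dempster_combine` and `best_singleton` are all hypothetical. Each mass function maps a subset of the label set to a belief mass, and the mass placed on the full label set represents a model's uncertainty.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function is a dict mapping a frozenset of hypotheses
    to a non-negative mass; the masses are assumed to sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            # Agreeing evidence accumulates on the common subset.
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Disjoint subsets contribute to the conflict mass K.
            conflict += mass_a * mass_b
    if conflict >= 1.0 - 1e-12:
        raise ValueError("Sources are in total conflict; Dempster's rule is undefined.")
    # Normalise by 1 - K so the fused masses sum to 1 again.
    return {subset: mass / (1.0 - conflict) for subset, mass in combined.items()}


def best_singleton(masses):
    """Return the single label with the largest fused mass."""
    singletons = {s: m for s, m in masses.items() if len(s) == 1}
    winner = max(singletons, key=singletons.get)
    return next(iter(winner))


# Hypothetical label set and mass assignments for one aligned group.
FRAME = frozenset({"g1", "g2", "g3"})
audio = {frozenset({"g1"}): 0.6, frozenset({"g2"}): 0.1, FRAME: 0.3}
skeleton = {frozenset({"g1"}): 0.5, frozenset({"g3"}): 0.2, FRAME: 0.3}

fused = dempster_combine(audio, skeleton)
print(best_singleton(fused), fused)
```

In this illustration both models assign most of their mass to `g1`, so the fused mass on `g1` exceeds either individual mass, while the products of disagreeing assignments are discarded as conflict and normalised away.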