Yuanyuan Zhu, Yingying Cai, Ye Liu
Analyzing human actions from skeletal data plays a prominent role in action recognition research. In contrast to methods that rely on RGB images or optical flow, skeletal information is intrinsically resilient: it is unaffected by dynamic backgrounds, lighting variations, and other environmental factors. Skeleton-based action recognition therefore holds significant research merit. ST-GCN (Spatio-Temporal Graph Convolutional Network) is a high-performing model for skeleton-based action recognition. It consists mainly of graph convolution and temporal convolution, which process the spatial and temporal dimensions respectively. These two parts operate separately, however, so information is not effectively exchanged between the two dimensions. To address this, we introduce a spatio-temporal fusion module (STFM) that integrates graph convolution and temporal convolution, merging the temporal and spatial dimensions and improving communication between them. By incorporating STFM into ST-GCN and combining it with dynamic topology, we construct a spatio-temporal graph convolutional network for skeleton-based action recognition. Experiments on the NTU RGB+D and NTU RGB+D 120 datasets show that the proposed method compares favorably with mainstream action recognition approaches in both recognition accuracy and model complexity.
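To make the separation described above concrete, the following is a minimal NumPy sketch of an ST-GCN-style block in which a spatial graph convolution and a temporal convolution are applied one after the other. It illustrates only the baseline structure the abstract criticizes, not the proposed STFM, whose design the abstract does not specify. The function names, tensor layout `(T, V, C)`, the symmetric normalization, and the toy 3-joint skeleton are all assumptions made for illustration.

```python
import numpy as np

def normalize_adjacency(A):
    """Symmetrically normalize A + I (assumed scheme): D^{-1/2} (A + I) D^{-1/2}."""
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    return A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def graph_conv(x, A_norm, W):
    """Spatial step: mix features across joints within each frame.
    x: (T, V, C_in), A_norm: (V, V), W: (C_in, C_out) -> (T, V, C_out)
    """
    return np.einsum("uv,tvc,cd->tud", A_norm, x, W)

def temporal_conv(x, K):
    """Temporal step: per-joint 1D convolution across frames, zero padded.
    x: (T, V, C), K: (k, C, C_out) with odd k -> (T, V, C_out)
    """
    k = K.shape[0]
    pad = k // 2
    T = x.shape[0]
    xp = np.pad(x, ((pad, pad), (0, 0), (0, 0)))
    out = np.zeros((T, x.shape[1], K.shape[2]))
    for i in range(k):
        # Each kernel tap sees a shifted window of frames for the same joint.
        out += np.einsum("tvc,cd->tvd", xp[i:i + T], K[i])
    return out

def st_block(x, A_norm, W, K):
    """Sequential spatial-then-temporal block with ReLU after each step."""
    h = np.maximum(graph_conv(x, A_norm, W), 0.0)
    return np.maximum(temporal_conv(h, K), 0.0)

# Toy example: 3 joints in a chain (0-1-2), 4 frames, 2 input channels.
rng = np.random.default_rng(0)
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
A_norm = normalize_adjacency(A)
x = rng.standard_normal((4, 3, 2))
W = rng.standard_normal((2, 5))
K = rng.standard_normal((3, 5, 5))
y = st_block(x, A_norm, W, K)
print(y.shape)
```

In this sketch the spatial step only combines joints within the same frame and the temporal step only combines frames for the same joint, so cross-joint, cross-frame interaction arises only indirectly by stacking blocks. That is the kind of split between the two dimensions that a fusion module is meant to address.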
Zhen Huang, Xu Shen, Xinmei Tian, Houqiang Li, Jianqiang Huang, Xian-Sheng Hua
Ping Yang, Qin Wang, Hao Chen, Zizhao Wu
Chongyang Ding, Shan Wen, Wenwen Ding, Kai Liu, Evgeny Belyaev
Sijie Yan, Yuanjun Xiong, Dahua Lin