Spatio-temporal graph convolutional networks have achieved impressive results in skeleton-based action recognition. However, existing methods treat the spatio-temporal features of joint correlations equally, ignoring that different spatio-temporal patterns contribute differently along the channel dimension, and they therefore struggle to distinguish easily confused actions that differ only subtly. In this paper, a spatio-temporal-aware channel excitation (STACE) module is proposed to explore channel-wise discriminative features of actions. More specifically, a spatio-aware channel excitation (SACE) sub-module is incorporated to capture global body structure patterns and excite the spatial-sensitive channels, which helps the network focus on the vital channels of crucial joints. Similarly, a temporal-aware channel excitation (TACE) sub-module is designed to learn inter-frame dynamics and excite the temporal-sensitive channels, with the aim of exploring the more informative dynamic patterns along the channel dimension. Finally, equipped with STACE, a spatio-temporal-aware channel excited graph convolutional network (STACE-GCN) is proposed and evaluated on two large-scale datasets, NTU RGB+D and NTU RGB+D 120, where it outperforms state-of-the-art methods.
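To make the idea of channel excitation concrete, the following is a minimal NumPy sketch of one way spatial-aware and temporal-aware channel gates could be computed for a skeleton feature tensor of shape (C, T, V), that is, channels, frames, and joints. It is not the authors' implementation. The function names, the pooling choices, the bottleneck weights `w_down` and `w_up`, and the sequential combination of the two gates are all assumptions made for illustration; the abstract does not specify these details.

```python
import numpy as np


def sigmoid(z):
    """Logistic function, used to squash gate logits into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def gate_from_descriptor(desc, w_down, w_up):
    """Map a per-channel descriptor (C,) to per-channel gates (C,).

    Assumed design: a two-layer bottleneck (ReLU, then sigmoid), in the
    style of squeeze-and-excitation. w_down: (C_r, C), w_up: (C, C_r).
    """
    hidden = np.maximum(w_down @ desc, 0.0)  # ReLU bottleneck
    return sigmoid(w_up @ hidden)


def spatial_gate(x, w_down, w_up):
    """Hypothetical spatial-aware gate for x of shape (C, T, V).

    Averages over frames to get a time-independent body summary (C, V),
    then pools over joints to get one descriptor value per channel.
    """
    body_summary = x.mean(axis=1)    # (C, V)
    desc = body_summary.mean(axis=1)  # (C,)
    return gate_from_descriptor(desc, w_down, w_up)


def temporal_gate(x, w_down, w_up):
    """Hypothetical temporal-aware gate for x of shape (C, T, V).

    Pools over joints to get a per-frame signal (C, T), then summarizes
    inter-frame dynamics as the mean absolute frame-to-frame difference.
    """
    per_frame = x.mean(axis=2)                          # (C, T)
    desc = np.abs(np.diff(per_frame, axis=1)).mean(axis=1)  # (C,)
    return gate_from_descriptor(desc, w_down, w_up)


def stace_sketch(x, spatial_weights, temporal_weights):
    """Rescale channels by both gates (assumed sequential combination)."""
    g_s = spatial_gate(x, *spatial_weights)
    g_t = temporal_gate(x, *temporal_weights)
    # Broadcast the (C,) gates over the frame and joint axes.
    return x * g_s[:, None, None] * g_t[:, None, None]
```

In this sketch each gate is a vector with one value per channel in the open interval (0, 1), so channels whose descriptors the bottleneck scores highly are attenuated less than the others. In a trained network the weights would be learned jointly with the graph convolution layers, whereas here they are plain arrays supplied by the caller.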
Ji Ma, Wei Liu, Linlin Ding, Hao Luo
Runjie Li, Ning He, Chaoqun Wang, Ruicheng Wang, Wenhua Wang
Ping Yang, Qin Wang, Hao Chen, Zizhao Wu
Zhiyun Zheng, Qilong Yuan, Huaizhu Zhang, Yizhou Wang, Junfeng Wang