Jian Lü, Tingting Huang, Bo Zhao, Xiaogai Chen, Jian Zhou, Kaibing Zhang
A crucial issue in current methods for skeleton-based action recognition is how to comprehensively capture the evolving global context and temporal dynamics, and how to extract discriminative representations from skeleton joints and body parts. To address these issues, this paper proposes a dual-excitation spatial-temporal graph convolution method. The method adopts a pyramid aggregation structure built from group convolutions, yielding a pyramid channel-split graph convolution module: by splitting channels, it integrates context information at different scales, enables interaction between channels of different dimensions, and establishes inter-channel dependencies. A motion excitation module is then introduced, which activates motion-sensitive channels by grouping feature channels and computing feature differences along the temporal dimension, forcing the model to focus on discriminative features that exhibit motion change. Additionally, a dual attention mechanism is proposed to highlight key joints and body parts within the overall skeleton action sequence, yielding a more interpretable representation of diverse action sequences. On the NTU RGB+D 60 dataset, accuracy reaches 91.6% on X-Sub and 96.9% on X-View; on the NTU RGB+D 120 dataset, it reaches 87.5% on X-Sub and 88.5% on X-Set, outperforming other methods and demonstrating the effectiveness of the proposed approach.
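The core idea of the motion excitation module, computing channel-wise feature differences along the temporal dimension and using them to gate motion-sensitive channels, can be illustrated with a minimal NumPy sketch. The channel-reduction ratio, the random projections standing in for 1x1 convolutions, and the residual re-weighting below are illustrative assumptions, not the paper's exact design.

```python
import numpy as np

def motion_excitation(x, reduction=4, rng=None):
    """Hypothetical sketch of motion excitation for skeleton features
    shaped (N, C, T, V): squeeze channels, take temporal differences,
    pool over joints, and gate the original channels with a sigmoid."""
    n, c, t, v = x.shape
    rng = np.random.default_rng(0) if rng is None else rng
    # Random projections stand in for learned 1x1 convolutions (assumption).
    w_down = rng.standard_normal((c // reduction, c)) / np.sqrt(c)
    w_up = rng.standard_normal((c, c // reduction)) / np.sqrt(c // reduction)
    xr = np.einsum('rc,nctv->nrtv', w_down, x)        # channel squeeze
    diff = xr[:, :, 1:, :] - xr[:, :, :-1, :]         # temporal differences
    diff = np.pad(diff, ((0, 0), (0, 0), (0, 1), (0, 0)))  # keep T frames
    pooled = diff.mean(axis=3, keepdims=True)         # pool over joints V
    gate = 1.0 / (1.0 + np.exp(-np.einsum('cr,nrtv->nctv', w_up, pooled)))
    return x + x * gate                               # residual re-weighting

# Toy input: batch 2, 8 channels, 5 frames, 25 joints (NTU skeleton size)
x = np.random.default_rng(1).standard_normal((2, 8, 5, 25))
y = motion_excitation(x)
```

The gate depends only on frame-to-frame differences, so static channels receive near-constant weighting while channels whose activations change over time are amplified, which is the effect the abstract describes.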