JOURNAL ARTICLE

SlowFast Multimodality Compensation Fusion Swin Transformer Networks for RGB-D Action Recognition

Xiongjiang Xiao, Ziliang Ren, Huan Li, Wenhong Wei, Zhiyong Yang, Huaide Yang

Year: 2023   Journal: Mathematics   Vol: 11 (9)   Pages: 2115-2115   Publisher: Multidisciplinary Digital Publishing Institute

Abstract

RGB-D-based recognition combines the complementary strengths of RGB and depth sequences, allowing human actions to be recognized effectively in different environments. However, it is difficult for the different modalities to learn spatio-temporal information from each other. To enhance the information exchange between modalities, we introduce a SlowFast multimodality compensation block (SFMCB), which is designed to extract compensation features. Concretely, the SFMCB fuses features from two independent pathways with different frame rates into a single convolutional neural network to improve model performance. Furthermore, we explore two fusion schemes for combining the features from the two pathways. To facilitate the learning of features from the multiple independent pathways, multiple loss functions are used for joint optimization. To evaluate the proposed architecture, we conducted experiments on four challenging datasets: NTU RGB+D 60, NTU RGB+D 120, THU-READ, and PKU-MMD. The experimental results demonstrate the effectiveness of the proposed model, which uses the SFMCB mechanism to capture complementary features of multimodal inputs.
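The two-pathway idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the strides (8 and 2), the two fusion rules (element-wise sum and concatenation), the loss weights, and all function names are assumptions chosen for illustration, since the abstract does not specify them.

```python
# Illustrative sketch only; strides, fusion rules, and weights are assumptions.
from typing import List

Vector = List[float]


def sample_frames(num_frames: int, stride: int) -> List[int]:
    """Return the indices of frames taken every `stride` frames."""
    return list(range(0, num_frames, stride))


def temporal_mean(features: List[Vector]) -> Vector:
    """Average per-frame feature vectors over the time axis."""
    n = len(features)
    return [sum(column) / n for column in zip(*features)]


def fuse_sum(slow: Vector, fast: Vector) -> Vector:
    """Hypothetical fusion scheme 1: element-wise sum of pathway features."""
    return [s + f for s, f in zip(slow, fast)]


def fuse_concat(slow: Vector, fast: Vector) -> Vector:
    """Hypothetical fusion scheme 2: concatenation of pathway features."""
    return slow + fast


def joint_loss(losses: List[float], weights: List[float]) -> float:
    """Weighted sum of per-pathway losses for joint optimization."""
    return sum(w * l for w, l in zip(weights, losses))


# Toy clip: 16 frames, each with a 2-dimensional feature [t, 1.0].
frames = [[float(t), 1.0] for t in range(16)]

slow_idx = sample_frames(len(frames), stride=8)   # low frame rate pathway
fast_idx = sample_frames(len(frames), stride=2)   # high frame rate pathway

slow_feat = temporal_mean([frames[i] for i in slow_idx])
fast_feat = temporal_mean([frames[i] for i in fast_idx])

fused_sum = fuse_sum(slow_feat, fast_feat)
fused_cat = fuse_concat(slow_feat, fast_feat)

# Made-up loss values for the slow, fast, and fused outputs.
total = joint_loss([0.5, 0.25, 1.0], [1.0, 1.0, 0.5])
```

In a real model each pathway would be a learned network rather than a temporal average; the sketch only shows how two frame rates yield two feature sets that are then merged and trained under a combined objective.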

Keywords:
RGB color model, Computer science, Multimodality, Artificial intelligence, Modalities, Convolutional neural network, Feature (linguistics), Pattern recognition (psychology), Frame (networking)

Metrics

Cited By: 5
FWCI (Field Weighted Citation Impact): 0.91
Refs: 56
Citation Normalized Percentile: 0.69

Topics

Human Pose and Action Recognition
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Hand Gesture Recognition Systems
Physical Sciences →  Computer Science →  Human-Computer Interaction
Anomaly Detection Techniques and Applications
Physical Sciences →  Computer Science →  Artificial Intelligence

Related Documents

JOURNAL ARTICLE

Swin-Fusion: Swin-Transformer with Feature Fusion for Human Action Recognition

Tiansheng Chen, Lingfei Mo

Journal: Neural Processing Letters   Year: 2023   Vol: 55 (8)   Pages: 11109-11130

JOURNAL ARTICLE

Dual-stream cross-modality fusion transformer for RGB-D action recognition

Zhen Liu, Jun Cheng, Libo Liu, Ziliang Ren, Qieshi Zhang, Chengqun Song

Journal: Knowledge-Based Systems   Year: 2022   Vol: 255   Pages: 109741-109741

JOURNAL ARTICLE

Cross-Modality Compensation Convolutional Neural Networks for RGB-D Action Recognition

Jun Cheng, Ziliang Ren, Qieshi Zhang, Xiangyang Gao, Fusheng Hao

Journal: IEEE Transactions on Circuits and Systems for Video Technology   Year: 2021   Vol: 32 (3)   Pages: 1498-1509

JOURNAL ARTICLE

Trear: Transformer-Based RGB-D Egocentric Action Recognition

Xiangyu Li, Yonghong Hou, Pichao Wang, Zhimin Gao, Mingliang Xu, Wanqing Li

Journal: IEEE Transactions on Cognitive and Developmental Systems   Year: 2021   Vol: 14 (1)   Pages: 246-252