JOURNAL ARTICLE

Spatio-temporal Collaborative Convolution for Video Action Recognition

Abstract

Although video action recognition has made great progress in recent years, it remains a challenging task because of its high computational cost. Designing a lightweight network is a feasible solution, but it may weaken the network's ability to model spatio-temporal information. In this paper, we propose a novel spatio-temporal collaborative convolution (denoted "STC-Conv") that efficiently encodes spatio-temporal information. STC-Conv collaboratively learns spatial and temporal features in a single convolution kernel. In short, temporal convolution and spatial convolution are integrated into one STC convolution kernel, which reduces model complexity and improves computational efficiency. STC-Conv is a general-purpose convolution that can be applied to existing 2D CNNs such as ResNet and DenseNet. Experimental results on Something Something V1, a dataset that depends heavily on temporal cues, demonstrate the superiority of our method. Notably, STC-Conv outperforms 3D CNNs at a computation cost even lower than that of standard 2D CNNs.
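
The cost claim in the abstract can be made concrete with a rough count of multiply-accumulate operations (MACs) per layer. The sketch below is illustrative only: the abstract does not describe the internal layout of the STC kernel, so the "split_sketch" entry is a hypothetical stand-in in which some output filters use a spatial 1x3x3 kernel and the rest use a temporal 3x1x1 kernel. The function names, the `temporal_ratio` parameter, and the layer sizes are all assumptions chosen for illustration, not values from the paper.

```python
def conv_macs(c_in, c_out, kernel, out_positions):
    """MACs of a dense convolution: one full kernel volume per output value."""
    kt, kh, kw = kernel
    return c_in * c_out * kt * kh * kw * out_positions


def compare_costs(c_in=64, c_out=64, frames=8, height=56, width=56,
                  temporal_ratio=0.25):
    """Per-layer MACs for three designs, assuming stride 1 and 'same' padding."""
    positions = frames * height * width
    # Hypothetical channel split (an assumption, not the paper's design):
    # a fraction of the output filters look along time, the rest along space.
    c_temporal = int(c_out * temporal_ratio)
    c_spatial = c_out - c_temporal
    return {
        # Standard 2D convolution applied to every frame independently.
        "2d_3x3": conv_macs(c_in, c_out, (1, 3, 3), positions),
        # Full 3D convolution over time, height, and width.
        "3d_3x3x3": conv_macs(c_in, c_out, (3, 3, 3), positions),
        # Mixed layer: spatial filters plus cheaper 3-tap temporal filters.
        "split_sketch": conv_macs(c_in, c_spatial, (1, 3, 3), positions)
                        + conv_macs(c_in, c_temporal, (3, 1, 1), positions),
    }


if __name__ == "__main__":
    for name, macs in compare_costs().items():
        print(f"{name}: {macs / 1e9:.2f} GMACs")
```

Under these assumptions, the 3x3x3 layer costs exactly three times the 2D layer, while the mixed layer costs less than the 2D layer because each temporal filter has 3 taps instead of 9. This is one way a layer that sees temporal context could undercut a standard 2D convolution in computation; whether STC-Conv achieves its savings this way is not stated in the abstract.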

Keywords:
Convolution (computer science); Kernel (algebra); Computer science; Computation; ENCODE; Computational complexity theory; Action recognition; Artificial intelligence; Filter (signal processing); Convolutional neural network; Pattern recognition (psychology); Feature (linguistics); Algorithm; Computer vision; Mathematics; Artificial neural network

Metrics

Cited By: 3
FWCI (Field Weighted Citation Impact): 0.31
Refs: 17
Citation Normalized Percentile: 0.56

Topics

Human Pose and Action Recognition
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Gait Recognition and Analysis
Physical Sciences →  Engineering →  Biomedical Engineering
Diabetic Foot Ulcer Assessment and Management
Health Sciences →  Medicine →  Endocrinology, Diabetes and Metabolism

Related Documents

JOURNAL ARTICLE

A Spatio-Temporal Attention Convolution Block for Action Recognition

Junjie Wang, Xueyan Wen

Journal: Journal of Physics Conference Series Year: 2020 Vol: 1651 (1) Pages: 012193-012193

JOURNAL ARTICLE

Spatio-Temporal Graph Convolution for Skeleton Based Action Recognition

Chaolong Li, Zhen Cui, Wenming Zheng, Chunyan Xu, Jian Yang

Journal: Proceedings of the AAAI Conference on Artificial Intelligence Year: 2018 Vol: 32 (1)

JOURNAL ARTICLE

Spatio-Temporal Collaborative Module for Efficient Action Recognition

Yanbin Hao, Shuo Wang, Yi Tan, Xiangnan He, Zhenguang Liu, Meng Wang

Journal: IEEE Transactions on Image Processing Year: 2022 Vol: 31 Pages: 7279-7291