JOURNAL ARTICLE

Integrating Temporal and Spatial Attention for Video Action Recognition

Yuanding Zhou, Baopu Li, Zhihui Wang, Haojie Li

Year: 2022; Journal: Security and Communication Networks; Vol: 2022; Pages: 1-8; Publisher: Hindawi Publishing Corporation

Abstract

In recent years, deep convolutional neural networks (DCNNs) have been widely used for video action recognition, and attention mechanisms are increasingly applied to the task. In this paper, we combine temporal and spatial attention to improve video action recognition. Specifically, we learn a set of sparse attention maps by computing class response maps, which locate the most informative region in each video frame. Each frame is then resampled using this information to form two new frames: one focusing on the most discriminative regions of the image and the other on the complementary regions. Once sparse attention has been computed, the newly generated frames are arranged in the order of the original video to form two new videos. These two videos are fed into a CNN as additional inputs to reinforce the learning of discriminative regions in the images (spatial attention). The CNN we use incorporates a frame selection strategy that allows it to focus on only some of the frames when performing classification (temporal attention). Finally, we combine the classification results of the three videos (original, discriminative, and complementary) to obtain the final result. Our experiments on the UCF101 and HMDB51 datasets show that our approach outperforms the best available methods.
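
The spatial part of the pipeline described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify the resampling or fusion rules, so the sketch assumes a response map of the same spatial size as the frame, stands in a crop-and-upsample around the response peak for the discriminative resampling, zeroes that window for the complementary view, and averages the three per-class score vectors. The names `peak_location`, `split_frame`, `build_videos`, `fuse_scores`, and the `half` parameter are hypothetical. The frame selection strategy (temporal attention) is not sketched.

```python
import numpy as np

def peak_location(response_map):
    """Return (row, col) of the strongest value in a 2D class response map."""
    return np.unravel_index(np.argmax(response_map), response_map.shape)

def split_frame(frame, response_map, half=2):
    """Build a discriminative and a complementary frame (illustrative stand-in).

    Assumes response_map has the same height and width as frame.
    """
    h, w = frame.shape[:2]
    r, c = peak_location(response_map)
    # Clamp a window of up to (2*half+1) pixels per side to the frame bounds.
    top, left = max(r - half, 0), max(c - half, 0)
    bottom, right = min(r + half + 1, h), min(c + half + 1, w)
    window = frame[top:bottom, left:right]
    # Nearest-neighbour upsampling of the window back to the frame size.
    rows = np.arange(h) * window.shape[0] // h
    cols = np.arange(w) * window.shape[1] // w
    discriminative = window[rows][:, cols]
    # Complementary view: suppress the peak window, keep everything else.
    complementary = frame.copy()
    complementary[top:bottom, left:right] = 0
    return discriminative, complementary

def build_videos(frames, response_maps):
    """Apply split_frame per frame, preserving the original frame order."""
    pairs = [split_frame(f, m) for f, m in zip(frames, response_maps)]
    discriminative = np.stack([p[0] for p in pairs])
    complementary = np.stack([p[1] for p in pairs])
    return discriminative, complementary

def fuse_scores(original, discriminative, complementary):
    """Average the per-class scores of the three streams (assumed fusion rule)."""
    return np.mean([original, discriminative, complementary], axis=0)
```

In this sketch each of the three videos (original, discriminative, complementary) would be classified separately, and `fuse_scores` would combine the resulting score vectors; a weighted or learned fusion would fit the abstract's description equally well.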

Keywords:
Discriminative model, Computer science, Artificial intelligence, Convolutional neural network, Frame (networking), Pattern recognition (psychology), Set (abstract data type), Focus (optics), Task (project management), Class (philosophy), Computer vision

Metrics

Cited By: 3
FWCI (Field Weighted Citation Impact): 0.37
Refs: 31
Citation Normalized Percentile: 0.52

Topics

Human Pose and Action Recognition
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Diabetic Foot Ulcer Assessment and Management
Health Sciences →  Medicine →  Endocrinology, Diabetes and Metabolism
Anomaly Detection Techniques and Applications
Physical Sciences →  Computer Science →  Artificial Intelligence