Action recognition in videos is a topic of interest in computer vision, owing to potential applications such as multimedia indexing and surveillance in public areas. In this research, we first propose spatial and temporal Convolutional Neural Networks (CNNs), based on transfer learning using ResNet101, GoogleNet, and VGG16, for human action recognition. In addition, hybrid networks such as CNN-Recurrent Neural Network (RNN) models are exploited as encoder-decoder architectures for video action classification. In particular, different types of RNNs, i.e. Long Short-Term Memory (LSTM), Bidirectional LSTM (BiLSTM), Gated Recurrent Unit (GRU), and Bidirectional GRU (BiGRU), are employed as the decoders for action recognition. To further enhance performance, diverse aggregation networks of CNN and CNN-RNN models are implemented. Specifically, an Average Fusion method is used to integrate spatial and temporal CNNs trained on images, as well as CNN-RNNs trained on videos, where the final classification is obtained by combining the Softmax scores of these models via late fusion. A total of 22 models (1 motion CNN, 3 spatial CNNs, 12 CNN-RNNs, and 6 fusion networks) are implemented and evaluated on the UCF11, UCF50, and UCF101 datasets for performance comparison. The empirical results indicate the significant efficiency of the Average Fusion of multiple spatial CNNs with one motion CNN, as well as of ResNet101-BiGRU, among all the networks for realistic video action recognition.
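The Average Fusion strategy described above can be sketched as follows: each model produces class scores for a clip, the scores are passed through a Softmax, and the per-class probabilities are averaged across models before taking the arg-max. This is a minimal illustrative sketch, not the authors' implementation; the logit values and the three-class setup are hypothetical.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the class axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def average_fusion(logits_per_model):
    # Late fusion: average the Softmax score vectors of several models,
    # then predict the class with the highest fused score.
    fused = np.mean([softmax(l) for l in logits_per_model], axis=0)
    return int(fused.argmax(axis=-1)), fused

# Hypothetical logits for one clip over 3 action classes, from
# a spatial CNN, a motion CNN, and a CNN-RNN model.
spatial = np.array([2.0, 0.5, 0.1])
motion  = np.array([0.3, 1.8, 0.2])
cnn_rnn = np.array([1.5, 0.4, 0.6])

pred, fused = average_fusion([spatial, motion, cnn_rnn])
```

Averaging probabilities rather than raw logits keeps each model's contribution on a comparable scale, which is why late fusion typically operates on Softmax scores.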
Pavan Dasari, Li Zhang, Yonghong Yu, Haoqian Huang, Rong Gao