In this paper, we propose a multimodal multi-stream deep learning framework to tackle the egocentric activity recognition problem using both video and sensor data. First, we experiment with and extend a multi-stream Convolutional Neural Network to learn spatial and temporal features from egocentric videos. Second, we propose a multi-stream Long Short-Term Memory architecture to learn features from multiple sensor streams (accelerometer, gyroscope, etc.). Third, we propose a two-level fusion technique and experiment with different pooling techniques to compute the prediction results. Experimental results on a multimodal egocentric dataset show that our proposed method achieves very encouraging performance, despite the constraint that the scale of existing egocentric datasets is still quite limited.
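To make the sensor branch and the two-level fusion scheme concrete, the following PyTorch sketch shows one plausible reading of the architecture: one LSTM branch per sensor stream, a first fusion level that combines the per-stream features into sensor-level class scores, and a second level that pools the sensor and video scores into the final prediction. All module names, dimensions, and the specific choices of concatenation and average pooling are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class SensorStreamLSTM(nn.Module):
    """One LSTM branch per sensor stream (e.g., accelerometer or gyroscope)."""
    def __init__(self, in_dim, hidden_dim=64):
        super().__init__()
        self.lstm = nn.LSTM(in_dim, hidden_dim, batch_first=True)

    def forward(self, x):           # x: (batch, time, in_dim)
        _, (h, _) = self.lstm(x)    # h: (1, batch, hidden_dim)
        return h.squeeze(0)         # last hidden state: (batch, hidden_dim)

class TwoLevelFusionModel(nn.Module):
    """Level 1: fuse per-stream LSTM features into sensor-level class scores.
    Level 2: pool sensor scores with video-stream scores for the final output."""
    def __init__(self, sensor_dims, num_classes, hidden_dim=64):
        super().__init__()
        self.branches = nn.ModuleList(
            SensorStreamLSTM(d, hidden_dim) for d in sensor_dims)
        self.sensor_head = nn.Linear(hidden_dim * len(sensor_dims), num_classes)

    def forward(self, sensor_seqs, video_scores):
        # Level-1 fusion: concatenate the per-stream LSTM features.
        feats = torch.cat(
            [b(x) for b, x in zip(self.branches, sensor_seqs)], dim=1)
        sensor_scores = self.sensor_head(feats)
        # Level-2 fusion: average-pool the modality scores (max pooling is
        # an alternative pooling choice one could experiment with).
        return torch.stack([sensor_scores, video_scores]).mean(dim=0)

# Example: 3-axis accelerometer and gyroscope streams, 10 activity classes.
model = TwoLevelFusionModel(sensor_dims=[3, 3], num_classes=10)
acc = torch.randn(8, 100, 3)        # (batch, time, channels)
gyr = torch.randn(8, 100, 3)
video_scores = torch.randn(8, 10)   # scores from the video CNN streams
out = model([acc, gyr], video_scores)  # (8, 10)
```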