JOURNAL ARTICLE

Attention-Based Deep Reinforcement Learning for Virtual Cinematography of 360$^{\circ}$ Videos

Jianyi Wang, Mai Xu, Lai Jiang, Yuhang Song

Year: 2020; Journal: IEEE Transactions on Multimedia; Vol: 23; Pages: 3227-3238; Publisher: Institute of Electrical and Electronics Engineers

Abstract

Virtual cinematography refers to automatically selecting a natural-looking normal field-of-view (NFOV) from an entire 360$^{\circ}$ video. Virtual cinematography can be modeled as a deep reinforcement learning (DRL) problem, in which an agent takes actions related to NFOV selection according to the environment of 360$^{\circ}$ video frames. More importantly, we find from our data analysis that the selected NFOVs attract significantly more attention than other regions, i.e., the NFOVs have high saliency. Therefore, in this paper, we propose an attention-based DRL (A-DRL) approach for virtual cinematography in 360$^{\circ}$ video. Specifically, we develop a new DRL framework for automatic NFOV selection that takes as input both the content and the saliency map of each 360$^{\circ}$ frame. Then, we propose a new reward function for the DRL framework, which considers the saliency values, the ground truth, and smooth transitions in NFOV selection. Subsequently, a simplified DenseNet (called Mini-DenseNet) is designed to learn the optimal policy by maximizing the reward. Based on the learned policy, our A-DRL approach selects the NFOV actions for virtual cinematography of 360$^{\circ}$ video. Extensive experiments show that our A-DRL approach outperforms other state-of-the-art virtual cinematography methods on the Sports-360 and Pano2Vid datasets.
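The abstract names three ingredients of the reward (saliency values, ground truth, and smooth transition) but does not give its formula. As a rough illustration only, the sketch below combines three such terms as a weighted sum. The function names, the weights, the use of great-circle distance, and the additive form are all assumptions made for this example and are not taken from the paper.

```python
import math


def angular_distance(a, b):
    """Great-circle distance in radians between two viewing directions.

    Each direction is a (longitude, latitude) pair in radians.
    """
    lon1, lat1 = a
    lon2, lat2 = b
    cos_d = (math.sin(lat1) * math.sin(lat2)
             + math.cos(lat1) * math.cos(lat2) * math.cos(lon1 - lon2))
    # Clamp to [-1, 1] to guard against floating-point rounding.
    return math.acos(max(-1.0, min(1.0, cos_d)))


def nfov_reward(mean_saliency, center, gt_center, prev_center,
                w_sal=1.0, w_gt=1.0, w_smooth=0.5):
    """Hypothetical reward for one NFOV selection step (not the paper's formula).

    mean_saliency: average saliency inside the selected NFOV, assumed in [0, 1].
    center:        selected NFOV center (lon, lat) in radians.
    gt_center:     ground-truth NFOV center for this frame.
    prev_center:   NFOV center selected for the previous frame.
    The weights w_sal, w_gt, w_smooth are illustrative defaults.
    """
    # Closeness to the ground truth: 1 when identical, 0 when antipodal.
    gt_term = 1.0 - angular_distance(center, gt_center) / math.pi
    # Smoothness: 1 when the view does not move, 0 for the largest possible jump.
    smooth_term = 1.0 - angular_distance(center, prev_center) / math.pi
    return w_sal * mean_saliency + w_gt * gt_term + w_smooth * smooth_term
```

Under these assumptions, an NFOV that covers salient content, sits near the ground-truth view, and moves little from the previous frame scores highest; a DRL agent trained on such a signal would be pushed toward all three properties at once.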

Keywords:
Notation, Computer science, Selection (genetic algorithm), Artificial intelligence, Cinematography, Frame (networking), Algorithm, Mathematics, Arithmetic, Physics

Metrics

Cited By: 8
FWCI (Field Weighted Citation Impact): 0.63
Refs: 69
Citation Normalized Percentile: 0.69

Topics

Visual Attention and Saliency Detection
Advanced Vision and Imaging
Video Analysis and Summarization
Field (all three topics): Physical Sciences → Computer Science → Computer Vision and Pattern Recognition
