JOURNAL ARTICLE

Multimodal-Based and Aesthetic-Guided Narrative Video Summarization

Jiehang Xie, Xuanbai Chen, Tianyi Zhang, Yixuan Zhang, Shao-Ping Lu, Pablo César, Yulu Yang

Year: 2022 Journal: IEEE Transactions on Multimedia Vol: 25 Pages: 4894-4908 Publisher: Institute of Electrical and Electronics Engineers

Abstract

Narrative videos usually convey their main content through multiple sources of narrative information, such as audio, video frames, and subtitles. Existing video summarization approaches rarely consider these multidimensional narrative inputs, or ignore the impact of the artistic assembly of shots when applied directly to narrative videos. This paper introduces a multimodal-based and aesthetic-guided narrative video summarization method. Our method leverages multimodal information, including visual content, subtitles, and audio, through dedicated key shot selection, subtitle summarization, and highlight extraction components. Furthermore, under the guidance of cinematographic aesthetics, we design a novel shot assembly module that ensures the completeness of shot content and then assembles the selected shots into the desired summary. In addition, our method supports flexible specification of shot selection: it automatically selects shots that are semantically related to user-specified text. Through extensive quantitative experimental evaluations and user studies, we demonstrate that our method effectively preserves the important narrative information of the original video and can rapidly produce high-quality, aesthetic-guided narrative video summaries.
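
The abstract mentions selecting shots that are semantically related to text supplied by the user, but does not say how relatedness is computed. The following is only a minimal sketch of that idea, not the authors' method: it assumes each shot comes with its subtitle text and substitutes a simple bag-of-words cosine similarity for whatever semantic model the paper uses. All names (`tokenize`, `cosine`, `select_shots`, `shot_subtitles`, `top_k`) are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_shots(shot_subtitles, user_text, top_k=2):
    """Return indices of the top_k shots most similar to user_text.

    Assumption: word overlap stands in for semantic relatedness.
    The result is sorted by shot index so the original order is kept.
    """
    query = Counter(tokenize(user_text))
    scored = [
        (cosine(Counter(tokenize(subtitle)), query), index)
        for index, subtitle in enumerate(shot_subtitles)
    ]
    # Drop shots with no overlap, then rank by descending similarity.
    ranked = sorted((p for p in scored if p[0] > 0), key=lambda p: (-p[0], p[1]))
    return sorted(index for _, index in ranked[:top_k])
```

For example, with the subtitles "the chef chops onions and garlic", "a car drives along the coast", "the chef plates the pasta dish", and "credits roll over music", the query "chef cooking pasta" selects shots 0 and 2. A real system would likely replace the word-overlap score with learned text or multimodal embeddings.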

Keywords:
Automatic summarization, Computer science, Narrative, Selection (genetic algorithm), Multimedia, Shot (pellet), Information retrieval, Key (lock), Artificial intelligence, Human–computer interaction, Linguistics

Metrics

Cited By: 17
FWCI (Field Weighted Citation Impact): 2.10
Refs: 67
Citation Normalized Percentile: 0.86

Topics

Video Analysis and Summarization
Physical Sciences → Computer Science → Computer Vision and Pattern Recognition
Music and Audio Processing
Physical Sciences → Computer Science → Signal Processing
Advanced Image and Video Retrieval Techniques
Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Related Documents

JOURNAL ARTICLE

GraphVSum: graph guided multimodal video summarization

Zhengyu Zhao, Cong Bai, Pengyi Hao

Journal: Multimedia Systems Year: 2025 Vol: 32 (1)
JOURNAL ARTICLE

Video Summarization Based on Multimodal Features

Yu Zhang, Ju Liu, Xiaoxi Liu, Xuesong Gao

Journal: International Journal of Multimedia Data Engineering and Management Year: 2020 Vol: 11 (4) Pages: 60-76
JOURNAL ARTICLE

AESTHETIC DRIVEN VIDEO SUMMARIZATION

T Shruthi, R Rahul, M Neha

Journal: INTERNATIONAL JOURNAL OF SCIENTIFIC RESEARCH IN ENGINEERING AND MANAGEMENT Year: 2024 Vol: 08 (12) Pages: 1-7