Video Summarization with Self-Attention Based Encoder-Decoder Framework

Xuming Feng; Lei Wang; Yaping Zhu

doi:10.1109/iccst50977.2020.00046

JOURNAL ARTICLE

Year: 2020 Pages: 208-214

Abstract

This paper proposes an efficient supervised video summarization algorithm built on a self-attention based encoder-decoder network. Given an input video, a Bi-GRU network with a self-attention mechanism encodes the contextual information of the video frames, and a GRU network serves as the decoder, accompanied by a regression network that predicts the importance score of every video frame. Experiments and analysis on the public benchmark datasets TVSum and SumMe validate the superiority of the algorithm.
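The pipeline described in the abstract (Bi-GRU encoder, self-attention, GRU decoder, regression to per-frame scores) can be sketched as a forward pass in NumPy. This is only an illustrative sketch, not the authors' implementation: the abstract does not specify the attention form, layer sizes, or how the attention output feeds the decoder, so scaled dot-product attention, bias-free GRU gates, the dimensions, the random weights, and all function names below are assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_gru(in_dim, hid_dim, rng):
    # Assumed simplification: one bias-free weight matrix per gate
    # (update z, reset r, candidate n), each acting on [input, hidden].
    return {k: rng.normal(0, 0.1, (in_dim + hid_dim, hid_dim)) for k in ("z", "r", "n")}

def run_gru(X, p, reverse=False):
    # Runs a GRU over the rows of X and returns one hidden state per time step.
    hid = p["z"].shape[1]
    h = np.zeros(hid)
    out = np.zeros((len(X), hid))
    order = range(len(X) - 1, -1, -1) if reverse else range(len(X))
    for t in order:
        xh = np.concatenate([X[t], h])
        z = sigmoid(xh @ p["z"])                              # update gate
        r = sigmoid(xh @ p["r"])                              # reset gate
        n = np.tanh(np.concatenate([X[t], r * h]) @ p["n"])   # candidate state
        h = (1 - z) * n + z * h
        out[t] = h
    return out

def self_attention(H):
    # Assumed form: scaled dot-product attention of the frame encodings over themselves.
    logits = H @ H.T / np.sqrt(H.shape[1])
    logits -= logits.max(axis=1, keepdims=True)               # numerical stability
    A = np.exp(logits)
    A /= A.sum(axis=1, keepdims=True)                         # each row sums to 1
    return A @ H, A

def frame_scores(X, params):
    # Encoder: forward and backward GRU states concatenated per frame (Bi-GRU).
    H = np.concatenate(
        [run_gru(X, params["fwd"]), run_gru(X, params["bwd"], reverse=True)], axis=1
    )
    C, A = self_attention(H)                                  # context-aware frame encodings
    D = run_gru(C, params["dec"])                             # GRU decoder
    # Regression head: one importance score in (0, 1) per frame.
    return sigmoid(D @ params["w"] + params["b"]), A

rng = np.random.default_rng(0)
T, feat, hid = 12, 16, 8          # hypothetical: 12 frames, 16-d features, 8 hidden units
params = {
    "fwd": init_gru(feat, hid, rng),
    "bwd": init_gru(feat, hid, rng),
    "dec": init_gru(2 * hid, hid, rng),
    "w": rng.normal(0, 0.1, hid),
    "b": 0.0,
}
X = rng.normal(size=(T, feat))    # stand-in for per-frame CNN features
scores, attn = frame_scores(X, params)
print(scores.shape, attn.shape)
```

In a supervised setting such as the one the abstract describes, the weights would be trained so that the predicted scores match annotated frame importance; the sketch omits training entirely and uses untrained random weights.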

Keywords:

Automatic summarization; Computer science; Encoder; Benchmark (surveying); ENCODE; Frame (networking); Decoding methods; Artificial intelligence; Encoding (memory); Real-time computing; Computer network; Algorithm

Metrics

FWCI (Field Weighted Citation Impact): 0.21

Citation Normalized Percentile: 0.51

Topics

Video Analysis and Summarization

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Music and Audio Processing

Physical Sciences → Computer Science → Signal Processing

Multimedia Communication and Technology

Social Sciences → Social Sciences → Sociology and Political Science

Related Documents

Video Summarization With Attention-Based Encoder–Decoder Networks

Video Summarization with Encoder-Decoder Based Graph-Attention Networks

Effective Video Summarization Using Channel Attention-Assisted Encoder–Decoder Framework

Unsupervised Video Summarization Based on An Encoder-Decoder Architecture

Dense Video Captioning with Hierarchical Attention-Based Encoder-Decoder Networks