JOURNAL ARTICLE

Video Summarization with Self-Attention Based Encoder-Decoder Framework

Abstract

This paper proposes an efficient supervised video summarization algorithm built on a self-attention-based encoder-decoder network. Given an input video, a Bi-GRU network with a self-attention mechanism encodes the contextual information of the video frames, and a GRU network serves as the decoder, accompanied by a regression network that predicts the importance score of every video frame. Experiments and analysis are conducted on the public benchmark datasets TvSum and SumMe; the results validate the superiority of our algorithm.
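The pipeline named in the abstract (Bi-GRU encoder, self-attention, GRU decoder, regression head) can be illustrated with a minimal, untrained sketch. The abstract does not specify layer sizes, the form of attention, or how the components are wired, so everything below is an assumption made for illustration: the bias-free `GRUCell`, the unparameterized scaled dot-product attention, the hidden size, the sigmoid output, and all function names are hypothetical and are not taken from the paper.

```python
import math
import random

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

class GRUCell:
    """Bias-free GRU cell (assumed form); each gate acts on the concatenation [x; h]."""
    def __init__(self, in_dim, hid_dim, rng):
        self.hid_dim = hid_dim
        self.Wz = rand_matrix(hid_dim, in_dim + hid_dim, rng)  # update gate
        self.Wr = rand_matrix(hid_dim, in_dim + hid_dim, rng)  # reset gate
        self.Wh = rand_matrix(hid_dim, in_dim + hid_dim, rng)  # candidate state

    def step(self, x, h):
        z = [sigmoid(v) for v in matvec(self.Wz, x + h)]
        r = [sigmoid(v) for v in matvec(self.Wr, x + h)]
        gated = [ri * hi for ri, hi in zip(r, h)]
        cand = [math.tanh(v) for v in matvec(self.Wh, x + gated)]
        # Interpolate between the previous state and the candidate state.
        return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]

def run_gru(cell, seq, reverse=False):
    """Return one hidden state per input step, stored in original frame order."""
    h = [0.0] * cell.hid_dim
    order = reversed(range(len(seq))) if reverse else range(len(seq))
    states = [None] * len(seq)
    for t in order:
        h = cell.step(seq[t], h)
        states[t] = h
    return states

def self_attention(H):
    """Scaled dot-product self-attention using H as queries, keys, and values."""
    scale = math.sqrt(len(H[0]))
    out = []
    for q in H:
        logits = [sum(a * b for a, b in zip(q, k)) / scale for k in H]
        m = max(logits)  # subtract the max for a numerically stable softmax
        exps = [math.exp(l - m) for l in logits]
        total = sum(exps)
        weights = [e / total for e in exps]
        out.append([sum(w * v[d] for w, v in zip(weights, H)) for d in range(len(q))])
    return out

def score_frames(frames, hid_dim=4, seed=0):
    """Map per-frame feature vectors to one importance score in (0, 1) per frame."""
    rng = random.Random(seed)
    in_dim = len(frames[0])
    fwd, bwd = GRUCell(in_dim, hid_dim, rng), GRUCell(in_dim, hid_dim, rng)
    # Bi-GRU encoder: concatenate forward and backward states for each frame.
    enc = [f + b for f, b in zip(run_gru(fwd, frames),
                                 run_gru(bwd, frames, reverse=True))]
    context = self_attention(enc)
    dec = run_gru(GRUCell(2 * hid_dim, hid_dim, rng), context)
    w_out = rand_matrix(1, hid_dim, rng)  # regression head (assumed linear + sigmoid)
    return [sigmoid(matvec(w_out, h)[0]) for h in dec]

# Toy input: 5 frames with made-up 3-dimensional features.
frames = [[0.1 * (t + d) for d in range(3)] for t in range(5)]
scores = score_frames(frames)
```

Because the weights are random and untrained, the resulting `scores` carry no meaning beyond showing the data flow: one score per frame, which a supervised setup would fit to annotated frame importance.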

Keywords:
Automatic summarization, Computer science, Encoder, Benchmark (surveying), ENCODE, Frame (networking), Decoding methods, Artificial intelligence, Encoding (memory), Real-time computing, Computer network, Algorithm

Metrics

Cited By: 6
FWCI (Field Weighted Citation Impact): 0.21
Refs: 27
Citation Normalized Percentile: 0.51

Topics

Video Analysis and Summarization
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Music and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing
Multimedia Communication and Technology
Social Sciences →  Social Sciences →  Sociology and Political Science
