The aim of video summarization is to create a short summary video which captures the essence of the original video and contains all the important events of the original video. This is helpful because, now we don't have to go through the entire video and are able to get a gist of it from just a short summary video. Current Supervised learning video summarization methods, use Convolutional Neural Networks and some supervised learning techniques use Recurrent Neural Networks in addition to them. We propose VidSum, an architecture for Video Summarization using Deep Learning. We combine Long Short Term Memory (LSTM) Networks with Convolutional Neural Networks to solve the problem of Video Summarization. Our deep learning model is able to find the temporal importance of video frames and is able to generate video summaries which are temporally coherent and contain the important parts of a video clip. In our testing, our model outperforms other models on the famous TVSum and SumMe datasets for the task of Video Summarization.
Sonia KhetarpaulLakshay JainKush GoyalP. Vishnu Tej
Rahul S BhatO JayanthPawan Prasad PPhani Kumar VedurumudiK N Divyaprabha
Rockson AgyemanMuhammad RafiqGyu Sang Choi