Rockson Agyeman, Muhammad Rafiq, Gyu Sang Choi
This paper presents a deep learning approach to summarizing long soccer videos that leverages the spatiotemporal learning capability of a three-dimensional Convolutional Neural Network (3D-CNN) and a Long Short-Term Memory (LSTM) Recurrent Neural Network (RNN). The proposed approach involves 1) the step-by-step development of a Residual Network (ResNet) based 3D-CNN that recognizes soccer actions, 2) the manual annotation of 744 soccer clips across five soccer action classes for training, and 3) the training of an LSTM network on soccer features extracted by the proposed ResNet based 3D-CNN. We combine the 3D-CNN and LSTM models to detect soccer highlights. To summarize a soccer match video, we model the input video as a sequential concatenation of video segments, each of which is included in the summary video only if its relevance is validated. To evaluate the proposed summarization system, we summarized 10 soccer videos, which 48 participants from 8 countries then rated on the Mean Opinion Score (MOS) scale. Overall, the summarized videos received an MOS of 4 out of 5.
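The segment-selection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Segment` type, the `select_highlights` function, the class names, and the 0.5 confidence threshold are all assumptions made for illustration, and the 3D-CNN and LSTM models are replaced by precomputed labels and scores. The sketch only shows how a video modeled as a sequence of segments might be reduced to the time intervals of relevant segments.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass
class Segment:
    """One fixed-length piece of the match video (hypothetical structure)."""
    start: float  # start time in seconds
    end: float    # end time in seconds
    label: str    # predicted action class (names here are placeholders)
    score: float  # model confidence in [0, 1]


def select_highlights(
    segments: List[Segment],
    highlight_labels: Set[str],
    min_score: float = 0.5,  # assumed threshold, not taken from the paper
) -> List[Tuple[float, float]]:
    """Return the time intervals to keep in the summary, in original order.

    A segment is kept when its predicted label is a highlight class and its
    confidence reaches min_score. Kept segments that touch are merged so the
    summary does not cut in the middle of a continuous action.
    """
    intervals: List[Tuple[float, float]] = []
    for seg in segments:
        if seg.label not in highlight_labels or seg.score < min_score:
            continue  # segment judged not relevant
        if intervals and intervals[-1][1] == seg.start:
            # Extend the previous interval instead of starting a new one.
            intervals[-1] = (intervals[-1][0], seg.end)
        else:
            intervals.append((seg.start, seg.end))
    return intervals


if __name__ == "__main__":
    # Placeholder predictions standing in for the 3D-CNN + LSTM output.
    predictions = [
        Segment(0.0, 10.0, "background", 0.9),
        Segment(10.0, 20.0, "goal", 0.8),
        Segment(20.0, 30.0, "goal", 0.7),
        Segment(30.0, 40.0, "corner", 0.3),
        Segment(40.0, 50.0, "free_kick", 0.6),
    ]
    print(select_highlights(predictions, {"goal", "corner", "free_kick"}))
```

In a full system the returned intervals would be cut from the source video and concatenated to produce the summary; that step depends on the video tooling in use and is left out here.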
Sonia Khetarpaul, Lakshay Jain, Kush Goyal, P. Vishnu Tej
Rahul S Bhat, O Jayanth, Pawan Prasad P, Phani Kumar Vedurumudi, K N Divyaprabha
Nishit Anand, Rupesh Kumar Koshariya, Varsha Garg