Public places where crowds gather are hotspots for abnormal incidents, so rapid detection of abnormal behavior through video surveillance is crucial for maintaining public safety. Surveillance videos in such scenarios often suffer from distant camera viewpoints, poor image quality, and occlusions caused by crowd congestion. This study proposes transferring the deep filter bank of a VGG network pretrained on ImageNet to detect abnormal behavior in crowded scenes. This approach leverages the strong modeling and generalization capabilities of deep convolutional neural networks (CNNs) while reducing the complexity of model training and computation. The original fully connected layer of the CNN is replaced with a Fisher kernel encoder, which effectively captures the crowd texture features extracted by the CNN. Finally, a support vector machine (SVM) classifies behavior as normal or abnormal. After parameter optimization on well-known public datasets, the proposed method achieves a recognition accuracy of 94.3%. Compared with several existing classical methods, it demonstrates advantages in both recognition accuracy and computational efficiency.
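The encoding step described above can be illustrated with a minimal sketch. It assumes the common "improved Fisher vector" formulation (gradients with respect to the means and standard deviations of a diagonal-covariance Gaussian mixture, followed by power and L2 normalization); the abstract does not state which variant the authors use. The function name `fisher_vector` and its parameters are hypothetical, the local descriptors `x` stand in for CNN convolutional-layer activations, and the mixture parameters are assumed to have been fitted beforehand.

```python
import numpy as np


def fisher_vector(x, weights, means, sigmas):
    """Encode local descriptors as a Fisher vector (illustrative sketch).

    x:       (T, D) local descriptors, e.g. CNN activations at T locations
    weights: (K,)   mixture weights of a pre-fitted diagonal GMM (assumed given)
    means:   (K, D) component means
    sigmas:  (K, D) component standard deviations
    Returns a vector of length 2 * K * D.
    """
    T, D = x.shape

    # Standardized differences between every descriptor and every component.
    diff = (x[:, None, :] - means[None, :, :]) / sigmas[None, :, :]  # (T, K, D)

    # Log of weight * Gaussian density, per descriptor and component.
    log_prob = (
        -0.5 * np.sum(diff ** 2, axis=2)
        - np.sum(np.log(sigmas), axis=1)[None, :]
        - 0.5 * D * np.log(2.0 * np.pi)
        + np.log(weights)[None, :]
    )  # (T, K)

    # Posterior responsibilities, computed stably by shifting before exp.
    log_prob -= log_prob.max(axis=1, keepdims=True)
    gamma = np.exp(log_prob)
    gamma /= gamma.sum(axis=1, keepdims=True)

    # Gradients with respect to the means and the standard deviations.
    g_mu = np.einsum("tk,tkd->kd", gamma, diff)
    g_mu /= T * np.sqrt(weights)[:, None]
    g_sigma = np.einsum("tk,tkd->kd", gamma, diff ** 2 - 1.0)
    g_sigma /= T * np.sqrt(2.0 * weights)[:, None]

    fv = np.concatenate([g_mu.ravel(), g_sigma.ravel()])

    # Power normalization followed by L2 normalization.
    fv = np.sign(fv) * np.sqrt(np.abs(fv))
    norm = np.linalg.norm(fv)
    return fv / norm if norm > 0 else fv
```

In a pipeline like the one outlined in the abstract, one such vector per frame or clip would then be passed to an SVM classifier; that stage, and the CNN feature extraction itself, are omitted here.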
Aybars Tokta, Alı Köksal Hocaoǧlu
Shaoci Xie, Xiaohong Zhang, Jing Cai
Xuguang Zhang, Qian Zhang, Shuo Hu, Chunsheng Guo, Hui Yu