JOURNAL ARTICLE

SwinAnomaly: Real-Time Video Anomaly Detection Using Video Swin Transformer and SORT

Abstract

Detecting anomalous events in videos is challenging because such events are infrequent and unpredictable in real-world scenarios. In this paper, we propose SwinAnomaly, a video anomaly detection approach built on a conditional GAN-based autoencoder whose feature extractors are Swin Transformers. Our approach encodes spatiotemporal features from a sequence of video frames using a 3D encoder and upsamples them with a 2D decoder to predict a future frame. We use patch-wise mean squared error between the predicted and actual frames to detect anomalies, and Simple Online and Real-time Tracking (SORT) to track them in real time. Our approach outperforms existing prediction-based video anomaly detection methods and exposes several parameters for adjusting how anomalies are localized. Extensive testing shows that SwinAnomaly achieves state-of-the-art performance on public benchmarks, demonstrating its effectiveness for real-world video anomaly detection. Furthermore, the approach has the potential to enhance public safety and security in applications such as crowd surveillance, traffic monitoring, and industrial safety.
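
The patch-wise mean squared error mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the patch size, the threshold, or any implementation details, so the function names `patchwise_mse` and `anomalous_patches`, the 16-pixel patch, and the 0.5 threshold below are all assumptions chosen for illustration, not the authors' settings. The sketch only scores a predicted frame against the actual frame; passing the flagged regions to a tracker such as SORT is not shown.

```python
import numpy as np


def patchwise_mse(pred, target, patch=16):
    """Return a grid of per-patch mean squared errors between two frames.

    `patch` is a hypothetical patch size; the abstract does not state one.
    """
    if pred.shape != target.shape:
        raise ValueError("frames must have the same shape")
    h, w = pred.shape[:2]
    if h % patch or w % patch:
        raise ValueError("frame height and width must be multiples of patch")
    err = (pred.astype(np.float64) - target.astype(np.float64)) ** 2
    if err.ndim == 3:
        err = err.mean(axis=2)  # average over colour channels
    # Split into (rows, patch, cols, patch) blocks and average inside each block.
    grid = err.reshape(h // patch, patch, w // patch, patch)
    return grid.mean(axis=(1, 3))


def anomalous_patches(scores, threshold):
    """Return (row, col) indices of patches whose error exceeds `threshold`."""
    return [tuple(int(i) for i in rc) for rc in np.argwhere(scores > threshold)]


if __name__ == "__main__":
    # Toy frames: the prediction differs from the actual frame in one patch.
    target = np.zeros((64, 64))
    pred = np.zeros((64, 64))
    pred[16:32, 32:48] = 1.0
    scores = patchwise_mse(pred, target, patch=16)
    print(anomalous_patches(scores, threshold=0.5))  # [(1, 2)]
```

Scoring per patch rather than per frame keeps a small, localized prediction error from being averaged away over the whole image, which is what makes localization possible in a prediction-based detector.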

Keywords:
Anomaly detection, Autoencoder, Video tracking, Encoder, Feature (linguistics), Frame (networking), Pattern recognition (psychology), sort

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 0
Citation Normalized Percentile: 0.37

Topics

Anomaly Detection Techniques and Applications
Physical Sciences →  Computer Science →  Artificial Intelligence
Human Pose and Action Recognition
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Video Analysis and Summarization
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition