JOURNAL ARTICLE

Recurrent Fine-Grained Self-Attention Network for Video Crowd Counting

Abstract

Striking a balance between exploiting spatio-temporal correlation and controlling model complexity is vital for video-based crowd counting methods. In this paper, we propose a Recurrent Fine-Grained Self-Attention Network (RFSNet) that achieves efficient and accurate counting in video scenes via the self-attention mechanism and a recurrent fine-tuning strategy. Specifically, we design a decoder that consists of patch-wise spatial self-attention and temporal self-attention. Compared with vanilla self-attention, it leverages the dependencies in the spatial and temporal domains separately, while significantly reducing computational complexity. Moreover, RFSNet recurrently feeds the features back into the decoder to enhance the spatio-temporal representations. This strategy not only simplifies the model structure and reduces the number of parameters, but also improves the quality of the estimated density maps. RFSNet achieves state-of-the-art performance on three video crowd counting benchmarks, and outperforms other methods by more than 20% on the challenging FDST dataset.
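
The abstract gives no implementation details, so the following is only a minimal NumPy sketch of the general idea, not the authors' code. It assumes features of shape (T, H, W, D), non-overlapping square patches, single-head attention with identity projections (no learned weights), and residual connections. All function names and the `patch` and `steps` parameters are hypothetical. It shows attention restricted to patches within each frame, attention across frames at each location, and the same decoder function being applied repeatedly, so that extra iterations add no parameters.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(tokens):
    """Single-head attention with identity projections: (..., N, D) -> (..., N, D)."""
    d = tokens.shape[-1]
    scores = tokens @ np.swapaxes(tokens, -1, -2) / np.sqrt(d)
    return softmax(scores, axis=-1) @ tokens

def patch_spatial_attention(x, patch):
    """Attend only within non-overlapping patch x patch windows of each frame."""
    T, H, W, D = x.shape
    assert H % patch == 0 and W % patch == 0
    hb, wb = H // patch, W // patch
    # (T, H, W, D) -> (T, hb, wb, patch*patch, D): one token sequence per window
    w = x.reshape(T, hb, patch, wb, patch, D).transpose(0, 1, 3, 2, 4, 5)
    w = self_attention(w.reshape(T, hb, wb, patch * patch, D))
    # Undo the window layout
    w = w.reshape(T, hb, wb, patch, patch, D).transpose(0, 1, 3, 2, 4, 5)
    return w.reshape(T, H, W, D)

def temporal_attention(x):
    """Attend across the T frames at each spatial location."""
    seq = self_attention(x.transpose(1, 2, 0, 3))  # (H, W, T, D)
    return seq.transpose(2, 0, 1, 3)               # back to (T, H, W, D)

def decoder(x, patch):
    """Assumed layout: spatial then temporal attention, each with a residual."""
    x = x + patch_spatial_attention(x, patch)
    return x + temporal_attention(x)

def recurrent_refine(x, patch, steps):
    """Feed the features through the same decoder `steps` times."""
    for _ in range(steps):
        x = decoder(x, patch)
    return x

def attention_pairs(T, H, W, patch):
    """Query-key pairs scored by full attention vs. the factorised scheme."""
    n = T * H * W
    return n * n, n * patch * patch + n * T

rng = np.random.default_rng(0)
features = rng.standard_normal((4, 8, 8, 16))
refined = recurrent_refine(features, patch=4, steps=3)
print(refined.shape)                 # (4, 8, 8, 16)
print(attention_pairs(4, 8, 8, 4))   # (65536, 5120)
```

Under these assumptions, full attention over all T·H·W tokens scores (THW)² query-key pairs, whereas the factorised version scores THW·(P² + T) pairs for patch side P, which is where the complexity reduction in such designs comes from.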

Keywords:
Computer science; Artificial intelligence; Domain (mathematical analysis); Video quality; Decoding methods; Computational complexity theory; State (computer science); Machine learning; Algorithm

Metrics

Cited By: 1
FWCI (Field Weighted Citation Impact): 0.18
Refs: 39
Citation Normalized Percentile: 0.38

Topics

Video Surveillance and Tracking Methods
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Anomaly Detection Techniques and Applications
Physical Sciences →  Computer Science →  Artificial Intelligence
Human Pose and Action Recognition
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition

Related Documents

JOURNAL ARTICLE

Fine-Grained Crowd Counting

Jia Wan, Nikil Senthil Kumar, Antoni B. Chan

Journal: IEEE Transactions on Image Processing, Year: 2021, Vol: 30, Pages: 2114-2126

BOOK-CHAPTER

FGENet: Fine-Grained Extraction Network for Congested Crowd Counting

Haoyuan Ma, Li Zhang, Xiang-Yi Wei

Lecture Notes in Computer Science, Year: 2024, Pages: 43-56

JOURNAL ARTICLE

Crowd Counting Network with Self-attention Distillation

Yaoyao Li, Li Wang, Huailin Zhao, Zhen Nie

Journal: Journal of Robotics Networking and Artificial Life, Year: 2020, Vol: 7 (2), Pages: 116-116

JOURNAL ARTICLE

Crowd Counting Network with Self-attention Distillation

Wang Li, Huailin Zhao, Zhen Nie, Yaoyao Li

Journal: Proceedings of International Conference on Artificial Life and Robotics, Year: 2020, Vol: 25, Pages: 587-591

JOURNAL ARTICLE

Frame-Recurrent Video Crowd Counting

Yi Hou, Shanghang Zhang, Rui Ma, Huizhu Jia, Xiaodong Xie

Journal: IEEE Transactions on Circuits and Systems for Video Technology, Year: 2023, Vol: 33 (9), Pages: 5186-5199