JOURNAL ARTICLE

Stochastic Multi-Scale Aggregation Network for Crowd Counting

Abstract

Crowd counting in unconstrained, congested scenes is an important task in computer vision. Its main difficulties stem from large variations in scale and density and from a tendency to over-fit. This paper presents a novel end-to-end stochastic multi-scale aggregation network (SMANet) that addresses these issues. Specifically, general features are first extracted by a front-end subnetwork and then fed into a back-end subnetwork, which consists of a stochastic multi-scale aggregation module, a density map generator, and a global prior encoder. The stochastic aggregation drives the multi-branch units to learn features at different scales effectively and reduces sensitivity to scale variation, whereas the global prior encoder encodes global contextual information and guarantees density consistency of the shared representations. SMANet is the first work to fuse multi-scale features in a stochastic manner for crowd counting. Experimental results on four public datasets demonstrate that SMANet consistently outperforms state-of-the-art methods.
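
The abstract does not state how the stochastic fusion is sampled, so the following is only an illustrative sketch under an assumed drop-path-style rule: during training each branch output is kept with probability keep_prob (with at least one branch always kept) and the kept outputs are averaged, while at inference all branches are averaged. The function name stochastic_aggregate, the keep_prob parameter, and the flattened list representation of a feature map are hypothetical and are not taken from the paper.

```python
import random
from typing import List, Optional

Feature = List[float]  # a flattened feature map, kept 1-D for brevity


def stochastic_aggregate(
    branches: List[Feature],
    keep_prob: float = 0.5,
    training: bool = True,
    rng: Optional[random.Random] = None,
) -> Feature:
    """Average branch outputs, randomly dropping branches while training.

    Assumed rule (not from the paper): each branch survives independently
    with probability keep_prob; if none survives, one is picked at random.
    """
    if not branches:
        raise ValueError("need at least one branch")
    rng = rng or random.Random()
    if training:
        kept = [b for b in branches if rng.random() < keep_prob]
        if not kept:
            # Never drop every branch, otherwise there is nothing to fuse.
            kept = [rng.choice(branches)]
    else:
        # Deterministic fusion at inference: use every scale.
        kept = branches
    n = len(kept)
    # Element-wise mean across the surviving branches.
    return [sum(vals) / n for vals in zip(*kept)]


# Three hypothetical branches, e.g. outputs of different receptive fields.
branches = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
fused_eval = stochastic_aggregate(branches, training=False)
fused_train = stochastic_aggregate(branches, rng=random.Random(0))
```

Because any subset of branches may be missing on a given training step, no single branch can rely on the others, which is one plausible reading of how stochastic fusion could encourage each scale to be learned effectively.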

Keywords:
Subnetwork; Computer science; Scale (ratio); Encoder; ENCODE; Fuse (electrical); Generator (circuit theory); Artificial intelligence; Data mining; Theoretical computer science; Power (physics); Engineering

Metrics

Cited By: 10
FWCI (Field Weighted Citation Impact): 0.73
Refs: 33
Citation Normalized Percentile: 0.71

Topics

Video Surveillance and Tracking Methods
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Anomaly Detection Techniques and Applications
Physical Sciences →  Computer Science →  Artificial Intelligence
Human Pose and Action Recognition
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
