This paper describes a scene invariant crowd counting algorithm that uses local features to monitor crowd size. Unlike previous algorithms that require each camera to be trained separately, the proposed method uses camera calibration to scale between viewpoints, allowing a system to be trained and tested on different scenes. A pre-trained system could therefore be used as a turn-key solution for crowd counting across a wide range of environments. The use of local features allows the proposed algorithm to calculate local occupancy statistics, and Gaussian process regression is used to scale to conditions which are unseen in the training data, also providing confidence intervals for the crowd size estimate. A new crowd counting database is introduced to the computer vision community to enable a wider evaluation over multiple scenes, and the proposed algorithm is tested on seven datasets to demonstrate scene invariance and high accuracy. To the authors' knowledge this is the first system of its kind due to its ability to scale between different scenes and viewpoints.
David RyanSimon DenmanClinton FookesSridha Sridharan
David RyanSimon DenmanSridha SridharanClinton Fookes
David RyanSimon DenmanClinton FookesSridha Sridharan
Huaiming LiFei WangFangfang SongLianqing Wang