Mengru Feng, Jiangjun Hu, Minghui Ou, Dongchun Li
As modernization and urbanization advance, large crowd gatherings have become common and pose a serious hidden danger to public safety. Crowd counting plays an important role in intelligent video surveillance: it estimates the total number of people in an image and can provide real-time warnings, which helps prevent safety accidents. In this paper, to build an efficient, lightweight crowd density estimation model, we propose a dilated and depthwise separable convolution network. Within the network, we build a dilated and depthwise separable convolution module that enlarges the receptive field of the convolution kernel without increasing the number of network parameters. Sparse (dilated) convolutional kernels are used to extract sparse features, and compact convolutional kernels are used to extract denser features, which improves the efficiency of the convolutional kernels. In an experimental comparison on the SHT Part A dataset, the accuracy of the proposed method is 2.6% higher than that of MCNN, with 7.6% fewer parameters.
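The two properties the abstract relies on can be checked with simple arithmetic. The following is a minimal sketch, not the authors' implementation: it counts weights in a standard convolution versus a depthwise separable one, and computes the effective kernel size of a dilated kernel. The function names, the channel counts (64 in, 128 out) and the 3 x 3 kernel are illustrative assumptions; the abstract does not give the layer configuration.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias terms omitted)."""
    return c_in * c_out * k * k


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a depthwise k x k convolution followed by a 1 x 1
    pointwise convolution (bias terms omitted)."""
    depthwise = c_in * k * k   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


def effective_kernel_size(k, dilation):
    """Side length of the input window covered by a k x k kernel whose
    taps are spaced `dilation` pixels apart."""
    return k + (k - 1) * (dilation - 1)


if __name__ == "__main__":
    # Hypothetical layer sizes, chosen only for illustration.
    c_in, c_out, k = 64, 128, 3
    print("standard conv weights: ", standard_conv_params(c_in, c_out, k))
    print("separable conv weights:", depthwise_separable_params(c_in, c_out, k))
    for d in (1, 2, 3):
        print(f"dilation {d}: covers", effective_kernel_size(k, d), "pixels per side")
```

Dilation changes only the spacing of the kernel taps, so the weight count from either function is the same for every dilation rate while the covered window grows. That is the sense in which the receptive field can be enlarged without adding parameters.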
Dengguo Yao, Yuanping Xu, Chaolong Zhang, Zhijie Xu, Jian Huang, Benjun Guo
Wei Sun, Xijie Zhou, Xiaorui Zhang, Xiaozheng He
Shaowei Pan, Xingxing Cheng, Wenjing Fan
Zhen Li, Zhibiao Zhao, Yulang He, Yan Shi, Qi Zhou