Abstract

Crowd counting is attracting rapidly growing research interest because of its potential value in numerous real-world applications. However, challenges such as occlusion, insufficient resolution and dynamic backgrounds mean that crowd counting remains an unsolved problem in computer vision. Density estimation is a popular strategy for crowd counting, but conventional density estimation methods perform pixel-wise regression without explicitly accounting for the interdependence of pixels. As a result, independent pixel-wise predictions can be noisy and inconsistent. To address this issue, we propose a Relational Attention Network (RANet) with a self-attention mechanism for capturing the interdependence of pixels. RANet enhances the self-attention mechanism by accounting for both short-range and long-range interdependence of pixels; we denote these implementations local self-attention (LSA) and global self-attention (GSA), respectively. We further introduce a relation module that fuses LSA and GSA to achieve more informative aggregated feature representations. We conduct extensive experiments on four public datasets: ShanghaiTech A, ShanghaiTech B, UCF-CC-50 and UCF-QNRF. Experimental results on all datasets suggest that RANet consistently reduces estimation errors and surpasses state-of-the-art approaches by large margins.
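
The distinction between short-range and long-range pixel interdependence can be illustrated with a toy example. The sketch below is not the RANet implementation: it assumes identity query/key/value projections (no learned weights), a square neighbourhood for the local variant, and a fixed average in place of the learned relation module. The names `self_attention`, `fuse`, `radius` and `alpha` are hypothetical and chosen only for this illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def self_attention(features, height, width, radius=None):
    """Scaled dot-product self-attention over a row-major H*W feature map.

    radius=None: every pixel attends to all pixels (global, GSA-like).
    radius=r:    every pixel attends only to pixels within a
                 (2r+1) x (2r+1) window around it (local, LSA-like).
    Assumption: queries, keys and values are the raw features.
    """
    dim = len(features[0])
    scale = math.sqrt(dim)
    out = []
    for i in range(height * width):
        yi, xi = divmod(i, width)
        neighbours = [
            j for j in range(height * width)
            if radius is None
            or max(abs(j // width - yi), abs(j % width - xi)) <= radius
        ]
        weights = softmax(
            [dot(features[i], features[j]) / scale for j in neighbours]
        )
        # Each output vector is a weighted average of the attended features.
        out.append([
            sum(w * features[j][c] for w, j in zip(weights, neighbours))
            for c in range(dim)
        ])
    return out


def fuse(local_out, global_out, alpha=0.5):
    """Placeholder fusion: a fixed convex combination of the two outputs.
    The paper's relation module is learned; this stand-in is an assumption."""
    return [
        [alpha * l + (1 - alpha) * g for l, g in zip(lv, gv)]
        for lv, gv in zip(local_out, global_out)
    ]


H, W = 3, 3
feats = [[float(i), 1.0] for i in range(H * W)]  # toy 2-channel feature map
gsa = self_attention(feats, H, W)            # long-range context
lsa = self_attention(feats, H, W, radius=1)  # short-range context
fused = fuse(lsa, gsa)
```

Because attention weights sum to one, each output vector is a weighted average of the input features it attends to; the local variant simply restricts which pixels take part in that average.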

Keywords:
Pixel; Computer science; Fuse (electrical); Artificial intelligence; Range (aeronautics); Implementation; Feature (linguistics); Regression; Machine learning; Data mining; Pattern recognition (psychology); Mathematics; Statistics

Metrics

Cited By: 198
FWCI (Field Weighted Citation Impact): 14.00
Refs: 65
Citation Normalized Percentile: 0.99
Is in top 1%
Is in top 10%

Topics

Video Surveillance and Tracking Methods
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Anomaly Detection Techniques and Applications
Physical Sciences →  Computer Science →  Artificial Intelligence
Human Pose and Action Recognition
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition

Related Documents

JOURNAL ARTICLE

Crowd counting with crowd attention convolutional neural network

Jiwei Chen, Wen Su, Zengfu Wang

Journal: Neurocomputing, Year: 2019, Vol: 382, Pages: 210-220

JOURNAL ARTICLE

Jointly attention network for crowd counting

Yuqiang He, Yinfeng Xia, Yizhen Wang, Baoqun Yin

Journal: Neurocomputing, Year: 2022, Vol: 487, Pages: 157-171

JOURNAL ARTICLE

Crowd Counting Guided by Attention Network

Pei Nie, Cien Fan, Lian Zou, Liqiong Chen, Xiaopeng Li

Journal: Information, Year: 2020, Vol: 11 (12), Pages: 567-567

JOURNAL ARTICLE

Crowd Counting Network with Self-attention Distillation

Yaoyao Li, Li Wang, Huailin Zhao, Zhen Nie

Journal: Journal of Robotics Networking and Artificial Life, Year: 2020, Vol: 7 (2), Pages: 116-116

JOURNAL ARTICLE

Spatial-Frequency Attention Network for Crowd Counting

Xiangyu Guo, Mingliang Gao, Wenzhe Zhai, Jianrun Shang, Qilei Li

Journal: Big Data, Year: 2022, Vol: 10 (5), Pages: 453-465