To address the low recognition accuracy and high missed-detection rate caused by numerous targets and severe occlusion in dense crowd detection, a dense crowd detection algorithm based on YOLOv5 with a coordinate attention (CA) mechanism is proposed. First, data augmentation is applied to the collected images. Second, depthwise separable convolution replaces the standard convolution in the backbone network, which effectively reduces the complexity and parameter count of the model, while the CA mechanism, which carries location information, is applied along the width and height of the effective feature layers for effective feature fusion so that the model can detect targets more accurately. Finally, the GIoU loss is replaced by the SIoU loss function to improve training speed and inference accuracy. Experimental results show that, compared with the original YOLOv5 network, the average precision (AP) of the improved network model increases by 3.9%, effectively improving the recognition accuracy for dense crowds.
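
As a rough illustration of why swapping standard convolution for depthwise separable convolution reduces model size, the sketch below compares the weight counts of the two layer types. The layer dimensions (128 input channels, 256 output channels, 3×3 kernel) are hypothetical values chosen for illustration and are not taken from the proposed network; bias terms are omitted.

```python
def conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weight count of a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weight count of a depthwise k x k convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer dimensions, for illustration only.
c_in, c_out, k = 128, 256, 3

standard = conv_params(c_in, c_out, k)
separable = depthwise_separable_params(c_in, c_out, k)
ratio = separable / standard   # equals 1/c_out + 1/k**2

print(f"standard:  {standard} weights")
print(f"separable: {separable} weights")
print(f"ratio:     {ratio:.3f}")
```

For this assumed layer the separable variant needs roughly 11.5% of the weights of the standard convolution, which matches the general ratio 1/c_out + 1/k² for the two layer types.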