Most existing human pose estimation methods focus on improving prediction accuracy, but their large number of network parameters and high computational complexity demand substantial computing resources. This paper proposes a lightweight human pose estimation method based on the high-resolution network. The bottleneck module and the basic module of the high-resolution network are redesigned by replacing ordinary convolution with depthwise separable convolution and by integrating an attention mechanism, which preserves the accuracy of the network while greatly reducing the number of parameters and the computational complexity of the model. Experimental results on the COCO val2017 dataset show that, compared with the high-resolution network, the number of model parameters is reduced by 84.5%, the computational complexity is reduced by 73.9%, and the accuracy of human keypoint detection increases by 0.2%.
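To give a rough sense of why swapping ordinary convolution for depthwise separable convolution shrinks a model, the sketch below compares weight counts for a single layer. The function names, the channel counts (64 in, 64 out), and the 3 x 3 kernel are illustrative assumptions rather than values taken from the paper, and the per-layer figure it prints is not the paper's whole-network 84.5% result.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in an ordinary k x k convolution (bias omitted)."""
    return c_in * c_out * k * k


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a k x k depthwise convolution followed by a
    1 x 1 pointwise convolution (bias omitted)."""
    depthwise = c_in * k * k   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


def reduction_ratio(c_in, c_out, k):
    """Fraction of weights saved by the separable form."""
    separable = depthwise_separable_params(c_in, c_out, k)
    standard = standard_conv_params(c_in, c_out, k)
    return 1 - separable / standard


if __name__ == "__main__":
    # Hypothetical layer shape, chosen only for illustration.
    c_in, c_out, k = 64, 64, 3
    print("standard conv weights: ", standard_conv_params(c_in, c_out, k))
    print("separable conv weights:", depthwise_separable_params(c_in, c_out, k))
    print("fraction saved: %.3f" % reduction_ratio(c_in, c_out, k))
```

The ratio of separable to standard weights works out to 1/c_out + 1/k^2, so the saving grows with the number of output channels and with the kernel size. The attention mechanism mentioned in the abstract adds some parameters back and is not modelled here.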
LIU Shengjie, HE Ning, WANG Xin, YU Haigang, HAN Wenjing
Xiaofang Mu, Shuxian Guo, Hong Shi, Mingxing Hou, Minghui Song, Yiming Wu, Zijian Wang
Sai Ma, Haibo Ge, Wenhao He, Chaofeng Huang, Yu An, Ting Zhou