This paper proposes a map-free navigation scheme based on continuous deep reinforcement learning to address the problem that robots cannot flexibly avoid obstacles and navigate in dynamic environments. The reinforcement learning algorithm used in this paper is Proximal Policy Optimization (PPO), and the baseline is the discrete deep reinforcement learning algorithm Deep Q-Network (DQN). Experiments in the Gazebo simulation environment show that the training efficiency and success rate of the PPO algorithm are much higher than those of the DQN algorithm. The policy model trained in the simulation environment is transferred directly to a physical robot, and the experimental results verify that the robot achieves good navigation and obstacle-avoidance performance without retraining: the measured single-target navigation success rate is 80%, and the multi-target navigation success rate is 70%.
Nanxun Duo, Qinzhao Wang, Qiang Lv, Heng Wei, Pei Zhang