Feiqiang Lin, Ze Ji, Changyun Wei, Raphael Grech
Most reinforcement learning (RL)-based approaches to mapless point-goal navigation assume that ground-truth robot poses are available, which is unrealistic in real-world applications. In this work, we remove this assumption and instead rely on observation-based localisation algorithms, such as Lidar-based or visual odometry, for robot self-pose estimation. Although these algorithms have achieved promising performance and are robust in many harsh environments, they can fail to track the robot's location when the observations perceived along its trajectory are insufficient or ambiguous. Using such localisation algorithms therefore introduces new, unstudied problems for mapless navigation. This work proposes a new RL-based algorithm with which robots learn to navigate in a way that prevents localisation failures and avoids getting trapped in local-minimum regions. This ability is learned through two techniques proposed in this work: a reward metric that punishes behaviours resulting in localisation failures, and a reconfigured state representation that combines the current observation with historical trajectory information, transforming the problem from a partially observable Markov decision process (POMDP) into a Markov decision process (MDP) so as to avoid local minima.
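The two techniques in the abstract can be sketched in code. The snippet below is a minimal illustration, not the paper's actual implementation: all constants (`LOC_FAIL_PENALTY`, `LOC_ERROR_THRESHOLD`, etc.), function names, and the specific shaping terms are hypothetical assumptions chosen only to show the idea of penalising localisation failure and of augmenting the state with trajectory history.

```python
import numpy as np

# Hypothetical constants -- the paper does not state its exact values.
GOAL_REWARD = 100.0          # bonus for reaching the goal
LOC_FAIL_PENALTY = -50.0     # punishment when localisation diverges
STEP_PENALTY = -0.05         # small per-step cost encouraging progress
LOC_ERROR_THRESHOLD = 0.5    # metres; estimate-vs-truth gap treated as failure

def reward(reached_goal, est_pose, true_pose, prev_dist, curr_dist):
    """Shaped reward that also punishes localisation failure.

    During training in simulation, the true pose is available, so the
    drift of the odometry estimate can be measured and penalised.
    """
    if reached_goal:
        return GOAL_REWARD
    # Localisation failure: the pose estimate has drifted too far.
    if np.linalg.norm(est_pose - true_pose) > LOC_ERROR_THRESHOLD:
        return LOC_FAIL_PENALTY
    # Progress term: positive when the robot moved closer to the goal.
    return (prev_dist - curr_dist) + STEP_PENALTY

def make_state(scan, goal_vec, history):
    """Augment the current observation with recent trajectory history,
    moving the decision problem from a POMDP towards an MDP."""
    return np.concatenate([scan, goal_vec, np.ravel(history)])
```

In this sketch the drift-based penalty can only be computed in simulation, where ground truth exists; at deployment time the learned policy no longer needs it, which is consistent with the abstract's goal of navigating without assuming ground-truth poses.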