Applying reinforcement learning (RL) algorithms to control systems design remains a challenging task due to the potential unsafe exploration and the low sample efficiency. In this paper, we propose a novel safe model-based RL algorithm to solve the collision-free model-reference trajectory tracking problem of uncertain autonomous vehicles (AVs). Firstly, a new type of robust control barrier function (CBF) condition for collision-avoidance is derived for the uncertain AVs by incorporating the estimation of the system uncertainty with Gaussian process (GP) regression. Then, a robust CBF-based RL control structure is proposed, where the nominal control input is composed of the RL policy and a model-based reference control policy. The actual control input obtained from the quadratic programming problem can satisfy the constraints of collision-avoidance, input saturation and velocity boundedness simultaneously with a relatively high probability. Finally, within this control structure, a Dyna-style safe model-based RL algorithm is proposed, where the safe exploration is achieved through executing the robust CBF-based actions and the sample efficiency is improved by leveraging the GP models. The superior learning performance of the proposed RL control structure is demonstrated through simulation experiments.
G. Geetha RamaniC. KarthikB. PranayD. PramodhB. Karthik Reddy
András MihályVũ Văn TấnTrong TuPéter Gáspár
Qingrui ZhangWei PanVasso Reppa
Qingrui ZhangWei PanVasso Reppa
C. LeTrần Nguyên NgọcVinh-Hao Nguyen