JOURNAL ARTICLE

Robust Safe Reinforcement Learning under Adversarial Disturbances

Abstract

Safety is a primary concern when applying reinforcement learning to real-world control tasks, especially in the presence of external disturbances. However, existing safe reinforcement learning algorithms rarely account for such disturbances, limiting their applicability and robustness in practice. To address this challenge, this paper proposes a robust safe reinforcement learning framework that handles worst-case disturbances. First, the paper presents a policy iteration scheme for computing the robust invariant set, i.e., the subset of the safe set within which persistent safety is possible. The key idea is to formulate a two-player zero-sum game over the safety value function from Hamilton-Jacobi reachability analysis, in which the protagonist (the control input) aims to maintain safety and the adversary (the external disturbance) tries to violate it. The paper proves that the proposed policy iteration algorithm converges monotonically to the maximal robust invariant set. Second, the paper integrates this policy iteration scheme into a constrained reinforcement learning algorithm that simultaneously synthesizes the robust invariant set and uses it for constrained policy optimization. This algorithm addresses both optimality and safety: it learns a policy that attains high reward while remaining safe under worst-case disturbances. Experiments on classic control tasks show that the proposed method achieves zero constraint violations under learned worst-case adversarial disturbances, whereas baseline algorithms violate the safety constraints substantially. The proposed method also attains performance comparable to the baselines in the absence of the adversary.
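To make the zero-sum safety value iteration in the abstract concrete, the following is a minimal sketch on a toy discrete 1-D system. All dynamics, action sets, and the safety margin here are illustrative assumptions, not the paper's actual benchmarks or algorithm; the paper works with a policy iteration scheme and learned function approximators, whereas this sketch uses exact tabular value iteration on a grid.

```python
# Toy zero-sum safety value iteration (Hamilton-Jacobi reachability style).
# Assumed dynamics: x' = x + u + d, with control u (protagonist) and
# disturbance d (adversary) each in {-1, 0, 1}.
# Assumed safety margin: h(x) = 4 - |x|, so the safe set is |x| <= 4.
# Safety value fixed point:
#   V(x) = min( h(x), max_u min_d V(x + u + d) )
# The protagonist maximizes the worst-case margin, the adversary minimizes it.
# The maximal robust invariant set is {x : V(x) >= 0}.

U = D = (-1, 0, 1)      # action sets for protagonist and adversary
GRID = range(-6, 7)     # states we iterate over

def h(x):
    return 4 - abs(x)   # signed safety margin of the safe set

def value_iteration(tol=1e-9, max_iters=100):
    V = {x: h(x) for x in GRID}  # initialize with the safety margin
    for _ in range(max_iters):
        def lookup(x):           # off-grid states fall back to h (unsafe)
            return V.get(x, h(x))
        V_new = {
            x: min(h(x), max(min(lookup(x + u + d) for d in D) for u in U))
            for x in GRID
        }
        if all(abs(V_new[x] - V[x]) < tol for x in GRID):
            return V_new
        V = V_new
    return V

V = value_iteration()
invariant_set = [x for x in GRID if V[x] >= 0]
print(invariant_set)  # states from which safety can be maintained forever
```

In this toy instance the iteration converges to V(0) = 3 rather than h(0) = 4: the adversary can always push the state one step before the controller reacts, so the worst-case margin is strictly below the static safe-set margin. The robust invariant set here coincides with the safe set, but with a stronger disturbance it would shrink to a strict subset, which is exactly why the paper constrains policy optimization to the robust invariant set rather than the safe set itself.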

Keywords:
Reinforcement learning, Adversarial systems, Reachability, Robustness, Mathematical optimization, Bellman equation, Robust invariant set, Robust control, Monotonic function, Artificial intelligence, Control systems

Metrics

- Cited by: 2
- FWCI (Field-Weighted Citation Impact): 0.51
- References: 39
- Citation Normalized Percentile: 0.69

Topics

- Reinforcement Learning in Robotics (Physical Sciences → Computer Science → Artificial Intelligence)
- Adversarial Robustness in Machine Learning (Physical Sciences → Computer Science → Artificial Intelligence)
- Adaptive Dynamic Programming Control (Physical Sciences → Computer Science → Computational Theory and Mathematics)

Related Documents

JOURNAL ARTICLE

Robust Adversarial Reinforcement Learning

Lerrel Pinto, James Davidson, Rahul Sukthankar, Abhinav Gupta

Journal: arXiv (Cornell University), Year: 2017, Pages: 2817-2826
JOURNAL ARTICLE

Robust Proximal Adversarial Reinforcement Learning Under Model Mismatch

Peng Zhai, Xiaoyi Wei, Taixian Hou, Xiaopeng Ji, Zhiyan Dong, Jiafu Yi, Lihua Zhang

Journal: IEEE Robotics and Automation Letters, Year: 2024, Vol: 9 (11), Pages: 10248-10255
BOOK-CHAPTER

Robust Adversarial Deep Reinforcement Learning

Di Wang

Series: Advances in Computational Intelligence and Robotics, Year: 2024, Pages: 106-125
JOURNAL ARTICLE

Active Robust Adversarial Reinforcement Learning Under Temporally Coupled Perturbations

Jiacheng Yang, Y. Wang, Lu Dong, Lei Xue, Changyin Sun

Journal: IEEE Transactions on Artificial Intelligence, Year: 2024, Vol: 6 (4), Pages: 874-884