Bridging Hamilton-Jacobi Safety Analysis and Reinforcement Learning

Jaime F. Fisac; Neil F. Lugovoy; Vicenç Rubies-Royo; Shromona Ghosh; Claire J. Tomlin

doi:10.1109/icra.2019.8794107

JOURNAL ARTICLE

Year: 2019 Pages: 8550-8556

Abstract

Safety analysis is a necessary component in the design and deployment of autonomous robotic systems. Techniques from robust optimal control theory, such as Hamilton-Jacobi reachability analysis, allow a rigorous formalization of safety as guaranteed constraint satisfaction. Unfortunately, the computational complexity of these tools for general dynamical systems scales poorly with state dimension, making existing tools impractical beyond small problems. Modern reinforcement learning methods have shown a promising ability to find approximate yet proficient solutions to optimal control problems in complex and high-dimensional systems; however, their application has in practice been restricted to problems with an additive payoff over time, which is unsuitable for reasoning about safety. In recent work, we introduced a time-discounted modification of the problem of maximizing the minimum payoff over time, central to safety analysis, through a modified dynamic programming equation that induces a contraction mapping. Here, we show how a similar contraction mapping can render reinforcement learning techniques amenable to quantitative safety analysis as tools to approximate the safe set and optimal safety policy. This opens a new avenue of research connecting control-theoretic safety analysis and the reinforcement learning domain. We validate the correctness of our formulation by comparing safety results computed through Q-learning to analytic and numerical solutions, and demonstrate its scalability by learning safe sets and control policies for simulated systems of up to 18 state dimensions using value learning and policy gradient techniques.
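
As a rough illustration of the contraction mapping mentioned in the abstract, the sketch below runs tabular value iteration with a discounted safety backup of the form V(x) = (1 - γ) l(x) + γ min{ l(x), max_u V(f(x, u)) }, where l is a signed safety margin (negative inside the failure set). The abstract itself does not state the equation, so this form should be read as an assumption about the formulation. The 11-state chain, its drift, the margin function, and all names (`margin`, `step`, `safety_value_iteration`) are hypothetical and chosen only to keep the example small; they are not taken from the paper's experiments.

```python
import numpy as np

# Hypothetical toy system (an assumption, not from the paper):
# states 0..10 on a line, control u in {-1, 0, +1}.
# For x >= 5 a drift of +2 per step overpowers the control.
N_STATES = 11
ACTIONS = (-1, 0, 1)
GAMMA = 0.99  # discount factor; the backup is a gamma-contraction


def margin(x):
    """Signed safety margin l(x): negative inside the failure set x >= 8."""
    return 7.5 - x


def step(x, u):
    """Deterministic dynamics f(x, u), clipped to the state range."""
    drift = 2 if x >= 5 else 0
    return int(np.clip(x + u + drift, 0, N_STATES - 1))


def safety_value_iteration(gamma=GAMMA, tol=1e-9, max_iters=10_000):
    """Iterate V <- (1 - gamma) * l + gamma * min(l, max_u V(f(x, u)))."""
    states = np.arange(N_STATES)
    l = margin(states).astype(float)
    # succ[x, a] is the successor of state x under action ACTIONS[a].
    succ = np.array([[step(x, u) for u in ACTIONS] for x in states])
    V = l.copy()
    for _ in range(max_iters):
        best_next = V[succ].max(axis=1)  # best achievable successor value
        V_new = (1 - gamma) * l + gamma * np.minimum(l, best_next)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new
        V = V_new
    return V


V = safety_value_iteration()
# States with positive value approximate the safe set.
safe_set = [int(x) for x in np.flatnonzero(V > 0)]
```

In this toy setup, states 5 to 7 have a positive margin but a negative value, because the drift makes entering the failure set unavoidable. The computed safe set is therefore smaller than the set of states that are merely outside the failure region, which is the distinction a min-over-time objective captures and an additive payoff does not.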

Keywords:

Reinforcement learning; Correctness; Computer science; Reachability; Stochastic game; Scalability; Dynamic programming; Bellman equation; Mathematical optimization; Artificial intelligence; Theoretical computer science; Mathematics; Algorithm

Metrics

FWCI (Field Weighted Citation Impact): 5.07

Citation Normalized Percentile: 0.96

Topics

Adversarial Robustness in Machine Learning

Physical Sciences → Computer Science → Artificial Intelligence

Reinforcement Learning in Robotics

Physical Sciences → Computer Science → Artificial Intelligence

Fuel Cells and Related Materials

Physical Sciences → Engineering → Electrical and Electronic Engineering

Related Documents

Hamilton-Jacobi Reachability in Reinforcement Learning: A Survey

Safe Multi-Agent Reinforcement Learning via Approximate Hamilton-Jacobi Reachability

On Safety and Liveness Filtering Using Hamilton–Jacobi Reachability Analysis

Ensuring safety for vehicle parking tasks using Hamilton-Jacobi reachability analysis

Exact and efficient Hamilton-Jacobi guaranteed safety analysis via system decomposition