Correct-by-synthesis reinforcement learning with temporal logic constraints

Min Wen; Rüdiger Ehlers; Ufuk Topcu

doi:10.1109/iros.2015.7354078

ScienceGate Book Chapters

JOURNAL ARTICLE

Correct-by-synthesis reinforcement learning with temporal logic constraints

Min Wen Rüdiger Ehlers Ufuk Topcu

Year: 2015 Pages: 4983-4990

DOI: 10.1109/iros.2015.7354078

Get Full-Text PDF Get Analytical Report

Abstract

We consider a problem on the synthesis of optimal reactive controllers with an a priori unknown performance criterion while satisfying a given temporal logic specification through the interaction with an uncontrolled environment. We decouple the problem into two sub-problems. First, we extract a (maximally) permissive strategy for the system, which encodes multiple (possibly all) ways in which the system can react to the adversarial environment and satisfy the specifications. Then, we quantify the a priori unknown performance criterion as a (still unknown) reward function, and compute - by using the so-called maximin-Q learning algorithm - an optimal strategy for the system within the operating envelope allowed by the permissive strategy. We establish both correctness (with respect to the temporal logic specifications) and optimality (with respect to the a priori unknown performance criterion) of this two-step technique for a fragment of temporal logic specifications. For specifications beyond this fragment, correctness can still be preserved, but the learned strategy may be sub-optimal. We present an algorithm to the overall problem, and demonstrate its use and computational requirements on a set of robot motion planning examples.

Keywords:

Correctness Temporal logic A priori and a posteriori Computer science Reinforcement learning Minimax Set (abstract data type) Fragment (logic) Robot Linear temporal logic Mathematical optimization Artificial intelligence Algorithm Theoretical computer science Mathematics

Metrics

Cited By

3.58

FWCI (Field Weighted Citation Impact)

Refs

0.95

Citation Normalized Percentile

Is in top 1%

Is in top 10%

Citation History

Topics

Formal Methods in Verification

Physical Sciences → Computer Science → Computational Theory and Mathematics

Receptor Mechanisms and Signaling

Life Sciences → Biochemistry, Genetics and Molecular Biology → Molecular Biology

Software Testing and Debugging Techniques

Physical Sciences → Computer Science → Software

Correct-by-synthesis reinforcement learning with temporal logic constraints

Abstract

Metrics

Citation History

Topics

Related Documents

Reinforcement Learning with Temporal Logic Constraints

Probabilistically Guaranteed Satisfaction of Temporal Logic Constraints During Reinforcement Learning

Reinforcement learning with soft temporal logic constraints using limit-deterministic generalized Büchi automaton

Deep Reinforcement Learning Under Signal Temporal Logic Constraints Using Lagrangian Relaxation

Reinforcement learning under temporal logic constraints as a sequence modeling problem