JOURNAL ARTICLE

Correct-by-synthesis reinforcement learning with temporal logic constraints

Abstract

We consider a problem on the synthesis of optimal reactive controllers with an a priori unknown performance criterion while satisfying a given temporal logic specification through the interaction with an uncontrolled environment. We decouple the problem into two sub-problems. First, we extract a (maximally) permissive strategy for the system, which encodes multiple (possibly all) ways in which the system can react to the adversarial environment and satisfy the specifications. Then, we quantify the a priori unknown performance criterion as a (still unknown) reward function, and compute - by using the so-called maximin-Q learning algorithm - an optimal strategy for the system within the operating envelope allowed by the permissive strategy. We establish both correctness (with respect to the temporal logic specifications) and optimality (with respect to the a priori unknown performance criterion) of this two-step technique for a fragment of temporal logic specifications. For specifications beyond this fragment, correctness can still be preserved, but the learned strategy may be sub-optimal. We present an algorithm to the overall problem, and demonstrate its use and computational requirements on a set of robot motion planning examples.

Keywords:
Correctness Temporal logic A priori and a posteriori Computer science Reinforcement learning Minimax Set (abstract data type) Fragment (logic) Robot Linear temporal logic Mathematical optimization Artificial intelligence Algorithm Theoretical computer science Mathematics

Metrics

53
Cited By
3.58
FWCI (Field Weighted Citation Impact)
22
Refs
0.95
Citation Normalized Percentile
Is in top 1%
Is in top 10%

Citation History

Topics

Formal Methods in Verification
Physical Sciences →  Computer Science →  Computational Theory and Mathematics
Receptor Mechanisms and Signaling
Life Sciences →  Biochemistry, Genetics and Molecular Biology →  Molecular Biology
Software Testing and Debugging Techniques
Physical Sciences →  Computer Science →  Software

Related Documents

JOURNAL ARTICLE

Reinforcement Learning with Temporal Logic Constraints

Bengt LennartsonQing‐Shan Jia

Journal:   IFAC-PapersOnLine Year: 2020 Vol: 53 (4)Pages: 485-492
JOURNAL ARTICLE

Probabilistically Guaranteed Satisfaction of Temporal Logic Constraints During Reinforcement Learning

Derya AksarayYasin YazıcıoğluAhmet Semi Asarkaya

Journal:   2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) Year: 2021 Vol: 16 Pages: 6531-6537
JOURNAL ARTICLE

Reinforcement learning under temporal logic constraints as a sequence modeling problem

Daiying TianHao FangQingkai YangHaoyong YuWenyu LiangYan Wu

Journal:   Robotics and Autonomous Systems Year: 2022 Vol: 161 Pages: 104351-104351
© 2026 ScienceGate Book Chapters — All rights reserved.