JOURNAL ARTICLE

Structured Reward Shaping using Signal Temporal Logic specifications

Abstract

Deep reinforcement learning has become a popular technique to train autonomous agents to learn control policies that enable them to accomplish complex tasks in uncertain environments. A key component of an RL algorithm is the definition of a reward function that maps each state and an action that can be taken in that state to a real-valued reward. Typically, reward functions informally capture an implicit (albeit vague) specification of the desired behavior of the agent. In this paper, we propose the use of the logical formalism of Signal Temporal Logic (STL) as a formal specification for the desired behaviors of the agent. Furthermore, we propose algorithms to locally shape rewards in each state with the goal of satisfying the high-level STL specification. We demonstrate our technique on two case studies: a cart-pole balancing problem with a discrete action space, and controlling the actuation of a simulated quadrotor for point-to-point movement. The proposed framework is agnostic to any specific RL algorithm, as locally shaped rewards can easily be used in concert with any deep RL algorithm.
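The abstract's idea of shaping rewards from an STL specification can be sketched as follows. This is an illustrative example, not the authors' algorithm: it uses the specification G(|theta| < THETA_MAX) ("always keep the pole angle below a bound", a plausible spec for the cart-pole case study), whose quantitative robustness over a trajectory prefix is min over t of (THETA_MAX - |theta_t|). The names `THETA_MAX`, `stl_robustness`, and `shaped_reward` are all assumptions introduced here.

```python
# Minimal sketch of STL-robustness-based reward shaping (illustrative only).
# Specification: G(|theta| < THETA_MAX), i.e. the pole angle must always
# stay within THETA_MAX. Its robustness is positive iff the spec is
# satisfied on every sample, and its magnitude is the satisfaction margin.

THETA_MAX = 0.2  # radians; hypothetical angle bound for a cart-pole task


def stl_robustness(thetas):
    """Robustness of G(|theta| < THETA_MAX) over a sequence of angles."""
    return min(THETA_MAX - abs(th) for th in thetas)


def shaped_reward(prefix_thetas, base_reward=0.0):
    """Locally shaped reward: the environment's base reward plus the
    robustness of the trajectory prefix observed so far, so the agent is
    rewarded for keeping a margin to the specification boundary."""
    return base_reward + stl_robustness(prefix_thetas)


# A prefix that respects the bound earns a positive bonus; a violating
# prefix is penalized in proportion to the worst violation.
good_prefix = [0.01, 0.05, -0.03]   # robustness 0.15
bad_prefix = [0.01, 0.25, -0.03]    # robustness -0.05
```

Because the shaped term is just a scalar added to the per-step reward, it can be plugged into any deep RL algorithm's training loop, matching the framework-agnostic claim in the abstract.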

Keywords:
Computer science; Reinforcement learning; Temporal logic; Artificial intelligence; State space; Theoretical computer science; Algorithm; Mathematics

Metrics

Cited By: 50
FWCI (Field Weighted Citation Impact): 5.14
Refs: 43
Citation Normalized Percentile: 0.95 (in top 1%)

Topics

Formal Methods in Verification
Physical Sciences →  Computer Science →  Computational Theory and Mathematics
Logic, Reasoning, and Knowledge
Physical Sciences →  Computer Science →  Artificial Intelligence
Reinforcement Learning in Robotics
Physical Sciences →  Computer Science →  Artificial Intelligence