This work introduces a new reinforcement learning (RL) method that performs safe exploration and estimates the uncertainty of state-action pairs using Monte Carlo (MC) dropout. The proposed method outperforms biased exploration in terms of the reward obtained during training. The study also investigates the sensitivity of the algorithm to the uncertainty-threshold hyperparameter, suggesting that a lower value leads to a safer policy, while a higher value can result in faster convergence. The proposed algorithm is evaluated on guiding a 2-degree-of-freedom planar robot in its task space, showing that it converges to an optimal policy while satisfying safety constraints.
Qisong Yang, Thiago D. Simão, Nils Jansen, Simon H. Tindemans, Matthijs T. J. Spaan
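The MC-dropout uncertainty estimate mentioned in the abstract can be illustrated with a minimal sketch: run several stochastic forward passes of a Q-network with random dropout masks and take the spread of the outputs as the uncertainty of a state-action pair. All names, network sizes, and the threshold value below are illustrative assumptions, not details from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical tiny Q-network weights (illustrative, not the paper's model).
W1 = rng.normal(size=(8, 16))
W2 = rng.normal(size=(16, 1))

def q_forward(x, drop_p=0.1):
    """One stochastic forward pass with a fresh random dropout mask."""
    h = np.maximum(x @ W1, 0.0)          # ReLU hidden layer
    mask = rng.random(h.shape) > drop_p  # Bernoulli dropout mask
    h = h * mask / (1.0 - drop_p)        # inverted-dropout scaling
    return float(h @ W2)

def mc_dropout_estimate(x, n_samples=100):
    """Mean and std. dev. of Q-estimates over repeated dropout passes."""
    samples = [q_forward(x) for _ in range(n_samples)]
    return float(np.mean(samples)), float(np.std(samples))

# An illustrative state-action feature vector.
x = rng.normal(size=(8,))
q_mean, q_unc = mc_dropout_estimate(x)

# Safe-exploration gate: treat the action as safe to explore only if its
# uncertainty is below the threshold hyperparameter the abstract discusses.
THRESHOLD = 5.0  # illustrative value; the paper studies sensitivity to it
is_safe = q_unc < THRESHOLD
```

Lowering `THRESHOLD` rejects more uncertain actions (a safer policy); raising it admits more of them, which can speed up convergence, mirroring the trade-off the abstract describes.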