Critic PI2: Master Continuous Planning via Policy Improvement with Path Integrals and Deep Actor-Critic Reinforcement Learning

He Ba; Jiajun Fan; Xian Guo; Jianye Hao

doi:10.1109/icarm52023.2021.9536131

JOURNAL ARTICLE

Year: 2021 Pages: 716-722

Abstract

Constructing agents with planning capabilities has long been one of the main challenges in the pursuit of artificial intelligence. Tree-based planning methods, from AlphaGo to MuZero, have enjoyed huge success in discrete domains such as chess and Go. Unfortunately, in real-world applications such as robot control and the inverted pendulum, where the action space is normally continuous, these tree-based planning techniques struggle. To address these limitations, we present a novel model-based reinforcement learning framework called Critic PI2, which combines the benefits of trajectory optimization, deep actor-critic learning, and model-based reinforcement learning. Our method is evaluated on inverted pendulum models and is applicable to many continuous control systems. Extensive experiments demonstrate that Critic PI2 achieves a new state of the art in a range of challenging continuous domains. Furthermore, we show that planning with a critic significantly increases sample efficiency and real-time performance. Our work opens a new direction toward learning the components of a model-based planning system and how to use them.
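
The abstract does not give algorithmic details, so the following is only a minimal sketch of the general idea of path-integral planning with a critic, not the authors' implementation. It assumes a toy one-dimensional model (`rollout`), a hand-written stand-in for a learned critic (`critic`), and illustrative hyperparameters; all of these names and values are hypothetical. Candidate action sequences are sampled around the current plan, scored by model rollout cost minus the critic's value at the final state, and averaged with exponentiated-cost weights.

```python
import numpy as np

HORIZON = 5  # planning horizon (illustrative value, not from the paper)


def rollout(x0, actions):
    """Hypothetical model: x_{t+1} = x_t + a_t with step cost x^2 + 0.01 a^2."""
    x, cost = x0, 0.0
    for a in actions:
        x = x + a
        cost += x**2 + 0.01 * a**2
    return cost, x


def critic(x_final):
    """Stand-in for a learned critic: value (negative cost-to-go) of the final state."""
    return -x_final**2


def plan_cost(x0, actions):
    """Model rollout cost over the horizon, extended by the critic beyond it."""
    cost, x_final = rollout(x0, actions)
    return cost - critic(x_final)


def pi2_step(x0, mean_actions, rng, n_samples=64, noise_std=0.5, temperature=0.1):
    """One generic PI2-style update: perturb the plan, weight by exp(-cost), average."""
    noise = rng.normal(0.0, noise_std, size=(n_samples, mean_actions.shape[0]))
    costs = np.array([plan_cost(x0, mean_actions + eps) for eps in noise])
    # Normalize costs to [0, 1] so the temperature is scale-independent.
    span = costs.max() - costs.min()
    scaled = (costs - costs.min()) / span if span > 0 else np.zeros_like(costs)
    weights = np.exp(-scaled / temperature)
    weights /= weights.sum()
    # Probability-weighted average of the sampled perturbations.
    return mean_actions + weights @ noise


def plan(x0, n_iters=30, seed=0):
    """Iterate the update from an all-zero plan."""
    rng = np.random.default_rng(seed)
    actions = np.zeros(HORIZON)
    for _ in range(n_iters):
        actions = pi2_step(x0, actions, rng)
    return actions


if __name__ == "__main__":
    x0 = 2.0
    print("initial plan cost:", plan_cost(x0, np.zeros(HORIZON)))
    print("optimized plan cost:", plan_cost(x0, plan(x0)))
```

Because the critic supplies a value estimate at the end of the rollout, the horizon in a sketch like this can stay short, which is one plausible reading of the abstract's claim about real-time performance.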

Keywords:

Reinforcement learning; Motion planning; Computer science; Inverted pendulum; Artificial intelligence; Trajectory; Tree (set theory); Robot; Control (management); Path (computing); State space; Machine learning; Mathematics

Metrics

FWCI (Field Weighted Citation Impact): 0.14

Citation Normalized Percentile: 0.55

Topics

Reinforcement Learning in Robotics

Physical Sciences → Computer Science → Artificial Intelligence

Artificial Intelligence in Games

Physical Sciences → Computer Science → Artificial Intelligence

Robotic Path Planning Algorithms

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Related Documents

Coverage Path Planning Using Actor–Critic Deep Reinforcement Learning

Broad Critic Deep Actor Reinforcement Learning for Continuous Control

Visual Navigation with Actor-Critic Deep Reinforcement Learning

Distributed On-Policy Actor-Critic Reinforcement Learning

Integrated Actor-Critic for Deep Reinforcement Learning