Model-Based Reinforcement Learning via Proximal Policy Optimization

Yuewen Sun; Xin Yuan; Wenzhang Liu; Changyin Sun

doi:10.1109/cac48633.2019.8996875

JOURNAL ARTICLE

Year: 2019 Pages: 4736-4740

Abstract

Proximal policy optimization (PPO) is a state-of-the-art model-free reinforcement learning algorithm. Its powerful policy search ability allows an agent to find the optimal policy by trial and error, but at the cost of high computation and low data efficiency. Model-based algorithms can use data more efficiently by learning a forward model from observations, but they face the challenge of model error. In this paper, we combine the strengths of both approaches and introduce a data-efficient model-based approach called PIPPO (probabilistic inference via PPO). It performs online probabilistic inference of the dynamics model using Gaussian process regression and executes offline policy improvement using PPO on the inferred model. Empirical evaluation on the pendulum benchmark problem shows that the proposed PIPPO algorithm achieves performance comparable to traditional PPO while requiring less interaction with the environment.
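
The two ingredients named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class GPDynamics, the function ppo_clip_term, the fixed kernel hyperparameters, and the 1-D toy system are all assumptions made for illustration (a real implementation would normally learn the hyperparameters and use the pendulum state). The first part fits a Gaussian process to observed state changes and returns a predictive mean and variance; the second part is the standard per-sample PPO clipped surrogate that a policy update on model-generated rollouts would maximize.

```python
import math

def rbf(a, b, length=1.0, signal=1.0):
    """Squared-exponential kernel between two input vectors."""
    sq = sum((x - y) ** 2 for x, y in zip(a, b))
    return signal ** 2 * math.exp(-0.5 * sq / length ** 2)

def cholesky(A):
    """Lower-triangular L with L L^T = A (A symmetric positive definite)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def solve_lower(L, b):
    """Forward substitution: solve L y = b."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return y

def solve_upper_t(L, y):
    """Back substitution: solve L^T x = y."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

class GPDynamics:
    """GP posterior over the state change delta = s' - s given input (s, a).

    Hypothetical helper: hyperparameters are fixed here, not learned.
    """
    def __init__(self, inputs, deltas, noise=1e-2):
        self.X = inputs
        # Kernel matrix with observation-noise variance on the diagonal.
        K = [[rbf(p, q) + (noise ** 2 if i == j else 0.0)
              for j, q in enumerate(inputs)] for i, p in enumerate(inputs)]
        self.L = cholesky(K)
        self.alpha = solve_upper_t(self.L, solve_lower(self.L, deltas))

    def predict(self, x):
        """Return the posterior mean and variance of delta at input x."""
        k = [rbf(x, p) for p in self.X]
        mean = sum(ki * ai for ki, ai in zip(k, self.alpha))
        v = solve_lower(self.L, k)
        var = rbf(x, x) - sum(vi * vi for vi in v)
        return mean, max(var, 0.0)

def ppo_clip_term(ratio, advantage, eps=0.2):
    """Per-sample PPO clipped surrogate: min(r*A, clip(r, 1-eps, 1+eps)*A)."""
    clipped = max(1.0 - eps, min(1.0 + eps, ratio))
    return min(ratio * advantage, clipped * advantage)

# Toy transitions from a hypothetical 1-D system with delta = -0.5 * s + a.
data_x = [(s / 2.0, a) for s in range(-2, 3) for a in (-1.0, 0.0, 1.0)]
data_y = [-0.5 * s + a for s, a in data_x]
model = GPDynamics(data_x, data_y)

mean_near, var_near = model.predict((0.5, 0.0))  # inside the observed data
mean_far, var_far = model.predict((6.0, 0.0))    # far from the observed data
print(mean_near, var_near, var_far)
```

The predictive variance is small where transitions have been observed and returns to the prior variance far from the data, which is the signal a model-based method can use to gauge model error before trusting simulated rollouts.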

Keywords: Computer science; Reinforcement learning; Benchmark (surveying); Probabilistic logic; Inference; Gaussian process; Machine learning; Artificial intelligence; Computation; Gaussian; Algorithm

Metrics

FWCI (Field Weighted Citation Impact): 1.23

Citation Normalized Percentile: 0.84

Topics

Reinforcement Learning in Robotics

Physical Sciences → Computer Science → Artificial Intelligence

Gaussian Processes and Bayesian Inference

Physical Sciences → Computer Science → Artificial Intelligence

Advanced Bandit Algorithms Research

Social Sciences → Decision Sciences → Management Science and Operations Research

Related Documents

Model-based Ensemble Reinforcement Learning with Soft Proximal Policy Optimization

Policy Optimization in Reinforcement Learning: Proximal Policy Optimization

Proximal Policy Optimization Based Decentralized Networked Multi-Agent Reinforcement Learning

Economic dispatch based on reinforcement learning using proximal policy optimization