This paper considers the model-free optimal control problem for discrete-time systems using the deep deterministic policy gradient adaptive dynamic programming (DDPGADP) algorithm. System data are collected through off-policy learning, and the control law is updated by policy gradient. The convergence of the DDPGADP algorithm is established by showing that the Q-function sequence is monotonically non-increasing and converges to the optimum. To implement the method, an actor-critic neural network structure is constructed, adopting the target-network technique from deep Q-learning during training. Finally, simulation examples are presented to verify the effectiveness of the proposed method.
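The ingredients named in the abstract — off-policy data collection, a policy-gradient actor update, a critic trained on a TD target, and slowly tracking target networks — can be illustrated with a minimal sketch. This is not the paper's algorithm: it uses a toy scalar linear system, linear actor and quadratic critic parameterizations, and hand-picked coefficients (`a`, `b`, `gamma`, `tau`, learning rates), all of which are illustrative assumptions rather than details from the source.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy scalar discrete-time system x_{k+1} = a*x_k + b*u_k (illustrative only)
a, b, gamma, tau = 0.9, 0.5, 0.95, 0.05

# Linear actor u = -K*x and quadratic critic Q(x, u) = w . [x^2, x*u, u^2]
K = 0.0                        # actor parameter
w = np.zeros(3)                # critic weights
K_targ, w_targ = K, w.copy()   # target networks: slowly tracking copies

def features(x, u):
    return np.array([x * x, x * u, u * u])

for step in range(4000):
    # Off-policy data: behavior policy = actor plus exploration noise
    x = rng.uniform(-1.0, 1.0)
    u = -K * x + 0.1 * rng.standard_normal()
    cost = x * x + u * u                      # stage cost to be minimized
    x_next = a * x + b * u

    # Critic: TD target evaluated with the *target* actor and *target* critic
    u_next = -K_targ * x_next
    td_target = cost + gamma * w_targ @ features(x_next, u_next)
    phi = features(x, u)
    w += 0.05 * (td_target - w @ phi) * phi   # semi-gradient step on TD error

    # Actor: policy-gradient step decreasing Q along dQ/du * du/dK
    u_pi = -K * x
    dq_du = w[1] * x + 2.0 * w[2] * u_pi      # dQ/du at the actor's action
    K -= 0.01 * dq_du * (-x)                  # du/dK = -x

    # Soft (Polyak) updates of the target networks, as in DDPG
    K_targ = (1 - tau) * K_targ + tau * K
    w_targ = (1 - tau) * w_targ + tau * w

print(f"learned gain K = {K:.3f}, closed-loop pole = {a - b * K:.3f}")
```

Under these assumptions the actor gain settles near the fixed point of the critic's greedy policy, and the closed-loop pole `a - b*K` lies inside the unit circle; the target copies change only by a fraction `tau` per step, which is what stabilizes the TD target during training.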
Jiahui Xu, Jingcheng Wang, Jun Rao, Yanjiu Zhong, Shangwei Zhao