JOURNAL ARTICLE

Parallel Cross Entropy Policy Gradient Adaptive Dynamic Programming for Optimal Tracking Control of Discrete-Time Nonlinear Systems

Jiahui Xu, Jingcheng Wang, Jun Rao, Yanjiu Zhong, Shunyu Wu, Qifang Sun

Year: 2024   Journal: IEEE Transactions on Systems, Man, and Cybernetics: Systems   Vol: 54 (6)   Pages: 3809-3821   Publisher: Institute of Electrical and Electronics Engineers

Abstract

Policy gradient adaptive dynamic programming (PGADP) is a recently acclaimed technique for the optimal control design of nonlinear systems. Nevertheless, it requires a substantial amount of interaction data from the controlled system, which can be costly or unsafe to collect in certain scenarios. This article introduces a parallel cross entropy optimization method-based PGADP (PCEOM-PGADP) algorithm for designing an optimal tracking controller for discrete-time nonlinear systems. The tracking problem is transformed into a regulation problem by constructing a tracking error system. The proposed algorithm is implemented with an actor–critic structure, where the actor network represents the control policy and the critic network assesses its performance. Through iterative interaction between the two networks, the optimal policy is ultimately obtained. The approach also uses the parallel cross entropy optimization method (PCEOM) to acquire a reasonable initial control policy for PGADP, thereby improving the efficiency of the learning process. Convergence of the algorithm is analyzed by showing that the generated $Q$ functions form a monotonically nonincreasing sequence. Finally, the effectiveness of the proposed PCEOM-PGADP algorithm is verified through simulation on a complex automated driving tracking system.
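
As a rough illustration of the idea of using a cross entropy search to obtain a reasonable initial policy, the following minimal sketch tunes a single feedback gain on a toy tracking error system. It is not the authors' PCEOM-PGADP algorithm: the error dynamics, the linear policy u = -gain * e, the quadratic cost weights, and the names `rollout_cost` and `cross_entropy_search` are all assumptions made for this example, and candidates are evaluated sequentially rather than in parallel.

```python
import math
import random
import statistics


def rollout_cost(gain, steps=30, e0=1.0):
    """Accumulated quadratic cost of the policy u = -gain * e on a toy error system."""
    e, cost = e0, 0.0
    for _ in range(steps):
        u = -gain * e
        # Stage cost: squared tracking error plus a penalty on control effort.
        cost += e * e + 0.1 * u * u
        # Hypothetical nonlinear tracking error dynamics (an assumption, not from the article).
        e = 0.9 * e + 0.1 * math.sin(e) + 0.5 * u
    return cost


def cross_entropy_search(cost_fn, mean=0.0, std=2.0, samples=50, elite=10, iters=20, seed=0):
    """Generic cross entropy method over one scalar parameter with a Gaussian sampler."""
    rng = random.Random(seed)
    for _ in range(iters):
        # Sample candidate parameters from the current Gaussian.
        candidates = [rng.gauss(mean, std) for _ in range(samples)]
        # Keep the lowest-cost candidates as the elite set.
        candidates.sort(key=cost_fn)
        best = candidates[:elite]
        # Refit the sampling distribution to the elite set.
        mean = statistics.fmean(best)
        std = statistics.pstdev(best) + 1e-6  # small floor keeps the sampler from collapsing
    return mean


if __name__ == "__main__":
    initial_gain = cross_entropy_search(rollout_cost)
    print("initial gain:", initial_gain)
    print("cost with zero gain:", rollout_cost(0.0))
    print("cost with searched gain:", rollout_cost(initial_gain))
```

In this sketch the returned gain would only serve as a starting point for a subsequent policy gradient stage. Because each candidate rollout is independent of the others, the evaluation step is the natural place to distribute work across processes in a parallel variant.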

Keywords:
Dynamic programming; Computer science; Mathematical optimization; Optimal control; Monotonic function; Nonlinear system; Discrete time and continuous time; Entropy (arrow of time); Convergence (economics); Tracking error; Control theory (sociology); Mathematics; Algorithm; Control (management); Artificial intelligence

Metrics

Cited By: 2
FWCI (Field Weighted Citation Impact): 1.58
Refs: 47
Citation Normalized Percentile: 0.73

Topics

Adaptive Dynamic Programming Control
Physical Sciences → Computer Science → Computational Theory and Mathematics
Mechanical Circulatory Support Devices
Physical Sciences → Engineering → Biomedical Engineering
Reinforcement Learning in Robotics
Physical Sciences → Computer Science → Artificial Intelligence