JOURNAL ARTICLE

Augmented Proximal Policy Optimization for Safe Reinforcement Learning

Juntao Dai, Jiaming Ji, Yang Long, Qian Zheng, Gang Pan

Year: 2023; Journal: Proceedings of the AAAI Conference on Artificial Intelligence; Vol: 37 (6); Pages: 7288-7295; Publisher: Association for the Advancement of Artificial Intelligence

Abstract

Safe reinforcement learning considers practical scenarios in which an agent must maximize the return while satisfying safety constraints. Current algorithms, which suffer from training oscillations or approximation errors, still struggle to update the policy efficiently with precise constraint satisfaction. In this article, we propose Augmented Proximal Policy Optimization (APPO), which augments the Lagrangian function of the primal constrained problem by attaching a quadratic deviation term. The constructed multiplier-penalty function dampens cost oscillation for stable convergence while being equivalent to the primal constrained problem, so that safety costs are controlled precisely. APPO alternately updates the policy and the Lagrangian multiplier by solving the constructed augmented primal-dual problem, which can be easily implemented with any first-order optimizer. We apply APPO to diverse safety-constrained tasks, setting a new state of the art compared with a comprehensive list of safe RL baselines. Extensive experiments verify the merits of our method in easy implementation, stable convergence, and precise cost control.
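
The alternating scheme described in the abstract can be illustrated on a toy problem. The sketch below is not the APPO algorithm itself: it applies one standard multiplier-penalty form (the classical augmented Lagrangian for an inequality constraint) to a one-dimensional problem, minimizing (x - 2)^2 subject to x <= 1. The function name `augmented_lagrangian_demo`, the penalty weight `sigma`, and the step sizes are assumptions chosen for illustration only. The scalar `x` stands in for the policy parameters, and plain gradient descent stands in for a first-order optimizer.

```python
def augmented_lagrangian_demo(sigma=1.0, outer_iters=50, inner_iters=200, lr=0.05):
    """Toy augmented Lagrangian loop (illustrative, not the paper's APPO).

    Problem: minimize f(x) = (x - 2)^2 subject to g(x) = x - 1 <= 0.
    Multiplier-penalty function (assumed classical form):
        L(x, lam) = f(x) + (max(0, lam + sigma * g(x))^2 - lam^2) / (2 * sigma)
    """
    x, lam = 0.0, 0.0
    for _ in range(outer_iters):
        # Primal step: gradient descent on L(., lam) with lam held fixed.
        for _ in range(inner_iters):
            g = x - 1.0
            # dL/dx = f'(x) + max(0, lam + sigma * g(x)) * g'(x), with g'(x) = 1.
            grad = 2.0 * (x - 2.0) + max(0.0, lam + sigma * g)
            x -= lr * grad
        # Dual step: move the multiplier along the constraint violation,
        # projecting onto lam >= 0.
        lam = max(0.0, lam + sigma * (x - 1.0))
    return x, lam


if __name__ == "__main__":
    x_opt, lam_opt = augmented_lagrangian_demo()
    print(f"x = {x_opt:.6f}, multiplier = {lam_opt:.6f}")
```

On this toy problem the iterates should approach the constrained optimum x = 1 with multiplier 2. The quadratic term is what lets the primal step settle on the constraint boundary instead of oscillating around it as the multiplier changes.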

Keywords:
Augmented Lagrangian method; Reinforcement learning; Mathematical optimization; Computer science; Convergence (economics); Multiplier (economics); Lagrange multiplier; Penalty method; Quadratic equation; Sequential quadratic programming; Lagrangian relaxation; Constraint (computer-aided design); Dual (grammatical number); Quadratic programming; Mathematics; Artificial intelligence

Metrics

Cited By: 12
FWCI (Field Weighted Citation Impact): 1.73
Refs: 40
Citation Normalized Percentile: 0.83

Topics

Reinforcement Learning in Robotics
Physical Sciences →  Computer Science →  Artificial Intelligence
Advanced Multi-Objective Optimization Algorithms
Physical Sciences →  Computer Science →  Computational Theory and Mathematics

Related Documents

JOURNAL ARTICLE

Coupled Penalties-Augmented Proximal Policy Optimization for Safe Reinforcement Learning

Ning Pang, Longyang Huang, Weidong Zhang

Journal: Journal of Physics Conference Series; Year: 2025; Vol: 3077 (1); Pages: 012002-012002
JOURNAL ARTICLE

Penalized Proximal Policy Optimization for Safe Reinforcement Learning

Linrui Zhang, Li Shen, Long Yang, Shixiang Chen, Xueqian Wang, Bo Yuan, Dacheng Tao

Journal: Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence; Year: 2022; Pages: 3744-3750
JOURNAL ARTICLE

Policy Optimization in Reinforcement Learning: Proximal Policy Optimization

Saurugger, Bernd

Journal: Zenodo (CERN European Organization for Nuclear Research); Year: 2023
JOURNAL ARTICLE

Convergent Policy Optimization for Safe Reinforcement Learning

Ming Yu, Zhuoran Yang, Mladen Kolar, Zhaoran Wang

Journal: arXiv (Cornell University); Year: 2019; Vol: 32; Pages: 3121-3133