JOURNAL ARTICLE

CVaR-Constrained Policy Optimization for Safe Reinforcement Learning

Qiyuan Zhang, Shu Leng, Xiaoteng Ma, Qihan Liu, Xueqian Wang, Bin Liang, Yu Liu, Jun Yang

Year: 2024; Journal: IEEE Transactions on Neural Networks and Learning Systems; Vol: 36 (1); Pages: 830-841; Publisher: Institute of Electrical and Electronics Engineers

Abstract

Current constrained reinforcement learning (RL) methods guarantee constraint satisfaction only in expectation, which is inadequate for safety-critical decision problems. Since a constraint satisfied only in expectation can still leave a high probability of exceeding the cost threshold, solving constrained RL problems with a high probability of constraint satisfaction is critical for RL safety. In this work, we treat the safety criterion as a constraint on the conditional value-at-risk (CVaR) of cumulative costs, and propose the CVaR-constrained policy optimization algorithm (CVaR-CPO) to maximize the expected return while ensuring that agents pay attention to the upper tail of constraint costs. Using a bound on the CVaR-related performance difference between two policies, we first reformulate the CVaR-constrained problem in an augmented state space using the state extension procedure and the trust-region method. CVaR-CPO then derives the optimal policy update by applying the Lagrangian method to the constrained optimization problem. In addition, CVaR-CPO uses the distribution of constraint costs to provide an efficient quantile-based estimate of the CVaR-related value function. We conduct experiments on constrained control tasks to show that the proposed method can produce behaviors that satisfy safety constraints and achieve performance comparable to most safe RL (SRL) methods.
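To illustrate the distinction the abstract draws between a constraint on expected cost and a constraint on the CVaR of cumulative cost, here is a minimal Python sketch. It assumes the standard Rockafellar-Uryasev definition of CVaR at level alpha and is not the estimator used in CVaR-CPO. The function names, the toy cost samples, the threshold, and the midpoint quantile levels are all hypothetical choices made for illustration.

```python
import numpy as np


def empirical_cvar(costs, alpha):
    """Return (VaR, CVaR) of sampled cumulative costs at confidence level alpha.

    Uses the Rockafellar-Uryasev form CVaR = nu + E[(C - nu)^+] / (1 - alpha),
    evaluated at nu = the empirical alpha-quantile (the VaR).
    """
    costs = np.asarray(costs, dtype=float)
    var = np.quantile(costs, alpha)
    excess = np.maximum(costs - var, 0.0)
    cvar = var + excess.mean() / (1.0 - alpha)
    return var, cvar


def quantile_cvar(quantile_values, alpha):
    """Approximate CVaR from N quantile estimates at levels (i + 0.5) / N.

    Assumption (not from the paper): quantile_values[i] estimates the cost
    quantile at level (i + 0.5) / N, as a quantile-regression critic might.
    """
    q = np.sort(np.asarray(quantile_values, dtype=float))
    taus = (np.arange(q.size) + 0.5) / q.size
    tail = q[taus >= alpha]
    if tail.size == 0:
        raise ValueError("alpha is above the highest quantile level")
    return tail.mean()


# Hypothetical episode costs: the mean meets a threshold of 6, the tail does not.
costs = np.arange(1, 11)
threshold = 6.0
var, cvar = empirical_cvar(costs, alpha=0.8)
mean_ok = costs.mean() <= threshold  # expectation constraint holds
cvar_ok = cvar <= threshold          # CVaR constraint is violated
```

On this toy data the mean cost is 5.5, so an expectation constraint with threshold 6 is satisfied, while the CVaR at level 0.8 is 9.5, so the tail-sensitive constraint is not. Averaging the quantile estimates above level alpha gives the same value here, which is the intuition behind quantile-based CVaR estimation.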

Keywords:
CVaR, Mathematical optimization, Constraint (computer-aided design), Computer science, Quantile, Expected shortfall, Reinforcement learning, Mathematics, Risk management, Economics, Econometrics, Artificial intelligence

Metrics

Cited By: 22
FWCI (Field Weighted Citation Impact): 14.05
Refs: 48
Citation Normalized Percentile: 0.98
Is in top 1%
Is in top 10%

Topics

Reinforcement Learning in Robotics
Physical Sciences →  Computer Science →  Artificial Intelligence
Formal Methods in Verification
Physical Sciences →  Computer Science →  Computational Theory and Mathematics

Related Documents

JOURNAL ARTICLE

Game-Theoretic Constrained Policy Optimization for Safe Reinforcement Learning

Changxin Zhang, Xinglong Zhang, Yixing Lan, Hao Gao, Xin Xu

Journal: IEEE Transactions on Neural Networks and Learning Systems; Year: 2025; Vol: 36 (10); Pages: 17990-18004

JOURNAL ARTICLE

Convergent Policy Optimization for Safe Reinforcement Learning

Ming Yu, Zhuoran Yang, Mladen Kolar, Zhaoran Wang

Journal: arXiv (Cornell University); Year: 2019; Vol: 32; Pages: 3121-3133

JOURNAL ARTICLE

Augmented Proximal Policy Optimization for Safe Reinforcement Learning

Juntao Dai, Jiaming Ji, Yang Long, Qian Zheng, Gang Pan

Journal: Proceedings of the AAAI Conference on Artificial Intelligence; Year: 2023; Vol: 37 (6); Pages: 7288-7295