JOURNAL ARTICLE

Offline–Online Actor–Critic

Xuesong Wang, Diyuan Hou, Longyang Huang, Yuhu Cheng

Year: 2022 Journal: IEEE Transactions on Artificial Intelligence Vol: 5 (1) Pages: 61-69 Publisher: Institute of Electrical and Electronics Engineers

Abstract

Offline–online reinforcement learning (RL) can effectively address the problem of missing data (commonly known as transitions) in offline RL. However, because of distribution shift, policy performance may degrade when an agent moves from the offline training phase to the online training phase. In this article, we first analyze the problems of distribution shift and policy performance degradation in offline–online RL. Then, to alleviate these problems, we propose a novel RL algorithm, offline–online actor–critic (O2AC). In O2AC, a behavior cloning constraint term is introduced into the policy objective function to address the distribution shift in the offline training phase. In addition, in the online training phase, the influence of the behavior cloning constraint term is gradually reduced, which alleviates the policy performance degradation. Experiments show that O2AC outperforms existing offline–online RL algorithms.
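
The idea summarized in the abstract, a policy objective with a behavior cloning constraint whose influence shrinks during online training, can be illustrated with a minimal sketch. The abstract does not give the actual O2AC objective or schedule, so everything below is an assumption made for illustration: the squared-error form of the constraint, the linear decay, and the hypothetical names `bc_weight`, `actor_loss`, `offline_steps`, and `decay_steps`.

```python
def bc_weight(step, offline_steps, decay_steps, initial_weight=1.0):
    """Weight on the behavior cloning term at a given training step.

    Held constant during the offline phase, then reduced linearly to
    zero over `decay_steps` online steps. The linear schedule is an
    assumption; the abstract only says the influence is gradually reduced.
    """
    if step < offline_steps:
        return initial_weight
    online_step = step - offline_steps
    remaining = max(0.0, 1.0 - online_step / decay_steps)
    return initial_weight * remaining


def actor_loss(q_values, policy_actions, dataset_actions, weight):
    """Policy loss: maximize Q while staying close to dataset actions.

    q_values:        Q(s, pi(s)) for each sample in the batch
    policy_actions:  actions proposed by the policy, one vector per sample
    dataset_actions: actions stored in the dataset for the same states
    weight:          current behavior cloning weight (see bc_weight)
    """
    n = len(q_values)
    # Negative mean Q-value: minimizing this maximizes the critic's estimate.
    q_term = -sum(q_values) / n
    # Mean squared distance between policy actions and dataset actions
    # (an assumed form of the behavior cloning constraint).
    bc_term = sum(
        sum((p - d) ** 2 for p, d in zip(pa, da))
        for pa, da in zip(policy_actions, dataset_actions)
    ) / n
    return q_term + weight * bc_term
```

With `offline_steps=100` and `decay_steps=50`, the weight stays at its initial value through step 100, is halved by step 125, and reaches zero from step 150 onward, at which point the loss reduces to the plain actor–critic term.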

Keywords:
Computer science, Internet privacy

Metrics

Cited By: 1
FWCI (Field Weighted Citation Impact): 0.20
Refs: 34
Citation Normalized Percentile: 0.55

Topics

Reinforcement Learning in Robotics
Physical Sciences →  Computer Science →  Artificial Intelligence
Digital Games and Media
Social Sciences →  Social Sciences →  Sociology and Political Science
Artificial Intelligence in Games
Physical Sciences →  Computer Science →  Artificial Intelligence

Related Documents

JOURNAL ARTICLE

Mild Policy Evaluation for Offline Actor–Critic

Longyang Huang, Botao Dong, Jinhui Lu, Weidong Zhang

Journal: IEEE Transactions on Neural Networks and Learning Systems Year: 2023 Vol: 35 (12) Pages: 17950-17964

JOURNAL ARTICLE

Dual Behavior Regularized Offline Deterministic Actor–Critic

Shuo Cao, Xuesong Wang, Yuhu Cheng

Journal: IEEE Transactions on Systems Man and Cybernetics Systems Year: 2024 Vol: 54 (8) Pages: 4841-4852

JOURNAL ARTICLE

Offline Robustness of Distributional Actor-Critic Ensemble Reinforcement Learning

Zihang Ma, Daphne Teck Ching Lai, Jianxiang Zhu, Yaxin Peng

Journal: Advances in Pure Mathematics Year: 2025 Vol: 15 (04) Pages: 269-290

JOURNAL ARTICLE

Robust Offline Actor-Critic with On-Policy Regularized Policy Evaluation

Shuo Cao, Xuesong Wang, Yuhu Cheng

Journal: IEEE/CAA Journal of Automatica Sinica Year: 2024 Vol: 11 (12) Pages: 2497-2511