Historical Decision-Making Regularized Maximum Entropy Reinforcement Learning

Botao Dong; Longyang Huang; Ning Pang; Hongtian Chen; Weidong Zhang

doi:10.1109/tnnls.2024.3481887

JOURNAL ARTICLE

Year: 2024; Journal: IEEE Transactions on Neural Networks and Learning Systems; Vol: 36 (7); Pages: 13446-13459; Publisher: Institute of Electrical and Electronics Engineers

Abstract

The exploration-exploitation dilemma persists in off-policy reinforcement learning (RL) algorithms, impeding improvements in policy performance and sample efficiency. To tackle this challenge, a novel historical decision-making regularized maximum entropy (HDMRME) RL algorithm is developed to strike a balance between exploration and exploitation. Built upon the maximum entropy RL framework, a historical decision-making regularization method is proposed to enhance the exploitation capability of RL policies. The theoretical analysis proves the convergence of HDMRME, investigates its tradeoff between exploration and exploitation, examines the disparity between the Q-function learned through HDMRME and the classic one, and analyzes the suboptimality of the trained policy. The performance of HDMRME is evaluated on various continuous-action control tasks from the MuJoCo and OpenAI Gym platforms. Comparative experiments demonstrate that HDMRME exhibits superior sample efficiency and achieves more competitive performance than other state-of-the-art RL algorithms.
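
The abstract does not state the form of the regularizer, so the following is only a minimal illustrative sketch and not the HDMRME algorithm itself. It evaluates, for a single state with discrete actions, the usual maximum entropy objective (expected Q-value plus an entropy bonus) and subtracts a penalty for deviating from a historical policy. The KL penalty, the function names, and the coefficients `alpha` and `beta` are all assumptions made for illustration.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete action distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def kl_divergence(p, q):
    """KL(p || q) for discrete distributions; assumes q > 0 wherever p > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def regularized_objective(q_values, policy, historical_policy, alpha=0.2, beta=0.1):
    """Per-state objective: E[Q] + alpha * H(policy) - beta * KL(policy || historical).

    The first two terms are the standard maximum entropy objective.
    The KL term is a HYPOTHETICAL stand-in for a historical
    decision-making regularizer; the paper's actual term may differ.
    """
    expected_q = sum(p * q for p, q in zip(policy, q_values))
    bonus = alpha * entropy(policy)
    penalty = beta * kl_divergence(policy, historical_policy)
    return expected_q + bonus - penalty


# Example: two actions, a uniform current policy, and a uniform historical policy.
value = regularized_objective([1.0, 0.0], [0.5, 0.5], [0.5, 0.5])
print(value)
```

In this toy setting, a larger `alpha` rewards more random (exploratory) policies, while a larger `beta` pulls the policy toward past decisions. This is one plausible way to read the exploration-exploitation tradeoff described above, under the stated assumptions.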

Keywords:

Reinforcement learning; Principle of maximum entropy; Artificial intelligence; Reinforcement; Computer science; Machine learning; Mathematics; Psychology; Social psychology

Metrics

FWCI (Field Weighted Citation Impact): 0.00

Citation Normalized Percentile: 0.18

Topics

Neural Networks and Applications

Physical Sciences → Computer Science → Artificial Intelligence

Related Documents

Maximum Entropy Inverse Reinforcement Learning

Market making through reinforcement learning and Wasserstein DRO : an entropy-regularized approach

Entropy regularized reinforcement learning using large deviation theory

Entropy Regularized Task Representation Learning for Offline Meta-Reinforcement Learning

Value Conditional State Entropy Reinforcement Learning for Autonomous Driving Decision Making