JOURNAL ARTICLE

Historical Decision-Making Regularized Maximum Entropy Reinforcement Learning

Botao Dong, Longyang Huang, Ning Pang, Hongtian Chen, Weidong Zhang

Year: 2024   Journal: IEEE Transactions on Neural Networks and Learning Systems   Vol: 36 (7)   Pages: 13446-13459   Publisher: Institute of Electrical and Electronics Engineers

Abstract

The exploration-exploitation dilemma persists in off-policy reinforcement learning (RL) algorithms, impeding improvements in policy performance and sample efficiency. To tackle this challenge, a novel historical decision-making regularized maximum entropy (HDMRME) RL algorithm is developed to strike a balance between exploration and exploitation. Built upon the maximum entropy RL framework, the historical decision-making regularization method is proposed to enhance the exploitation capability of RL policies. The theoretical analysis proves the convergence of HDMRME, investigates its tradeoff between exploration and exploitation, examines the disparity between the Q-function learned through HDMRME and the classic one, and analyzes the suboptimality of the trained policy. The performance of HDMRME is evaluated across various continuous-action control tasks from the MuJoCo and OpenAI Gym platforms. Comparative experiments demonstrate that HDMRME exhibits superior sample efficiency and achieves more competitive performance than other state-of-the-art RL algorithms.

Keywords:
Reinforcement learning; Principle of maximum entropy; Artificial intelligence; Reinforcement; Computer science; Machine learning; Mathematics; Psychology; Social psychology

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 48
Citation Normalized Percentile: 0.18

Topics

Neural Networks and Applications
Physical Sciences →  Computer Science →  Artificial Intelligence

Related Documents

JOURNAL ARTICLE

Maximum Entropy Inverse Reinforcement Learning

Brian D. Ziebart, Andrew L. Maas, J. Andrew Bagnell, Anind K. Dey

Journal: Research Showcase @ Carnegie Mellon University (Carnegie Mellon University)   Year: 2008   Pages: 1433-1438

DISSERTATION

Market making through reinforcement learning and Wasserstein DRO : an entropy-regularized approach

Fang, Zhou

University: Texas Digital Library (University of Texas)   Year: 2025

JOURNAL ARTICLE

Entropy Regularized Task Representation Learning for Offline Meta-Reinforcement Learning

Mohammadreza Nakhaeinezhadfard, Aidan Scannell, Joni Pajarinen

Journal: Proceedings of the AAAI Conference on Artificial Intelligence   Year: 2025   Vol: 39 (18)   Pages: 19616-19623