JOURNAL ARTICLE

Hierarchical Multi-Agent Reinforcement Learning with Intrinsic Reward Rectification

Abstract

Hierarchical reinforcement learning (HRL) is a promising approach to long-horizon decision problems and complex tasks, since the high-level policy can guide the training of the low-level policy through macro actions and intrinsic rewards. However, current HRL algorithms disregard the degree to which macro actions influence decision-making, which should determine how much intrinsic reward the low-level policy receives: if a macro action matters little to the decision, the low-level policy should arguably receive less intrinsic reward. In this paper, we propose a value-decomposition-based hierarchical multi-agent reinforcement learning method with intrinsic reward rectification, which estimates the effectiveness of macro actions and corrects the intrinsic rewards accordingly. We show that our method significantly outperforms state-of-the-art value decomposition approaches on the StarCraft Multi-Agent Challenge platform.
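The rectification idea in the abstract can be sketched as follows. This is a minimal illustration, not the paper's actual algorithm: the influence measure (a squashed value gap between acting with and without the macro action) and all function names are assumptions introduced here for clarity.

```python
import math


def rectified_intrinsic_reward(q_with_macro: float,
                               q_without_macro: float,
                               intrinsic_reward: float) -> float:
    """Scale the intrinsic reward by how much the macro action matters.

    Hypothetical influence measure: the absolute gap between the value
    estimate with and without the macro action, squashed into [0, 1).
    A macro action that barely changes the value yields little intrinsic
    reward; an influential one passes most of it through.
    """
    influence = math.tanh(abs(q_with_macro - q_without_macro))
    return influence * intrinsic_reward


def low_level_reward(extrinsic: float, intrinsic: float,
                     q_with_macro: float, q_without_macro: float) -> float:
    # The low-level policy trains on the extrinsic reward plus the
    # rectified (influence-weighted) intrinsic reward.
    return extrinsic + rectified_intrinsic_reward(
        q_with_macro, q_without_macro, intrinsic)
```

When the macro action has no effect on the value estimate, the intrinsic bonus vanishes and the low-level policy sees only the extrinsic reward; as the macro action's influence grows, more of the intrinsic reward is passed through.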

Keywords:
Hierarchical reinforcement learning; Multi-agent reinforcement learning; Intrinsic reward; Value decomposition; Macro actions

Metrics

Cited by: 2
FWCI (Field-Weighted Citation Impact): 0.51
References: 29
Citation Normalized Percentile: 0.64
Topics

Reinforcement Learning in Robotics (Physical Sciences → Computer Science → Artificial Intelligence)
Mobile Crowdsensing and Crowdsourcing (Physical Sciences → Computer Science → Computer Science Applications)
Distributed Control Multi-Agent Systems (Physical Sciences → Computer Science → Computer Networks and Communications)