Yongkai Tian, Xin Yu, Yue Qi, Li Wang, Pu Feng, Wenjun Wu, Rongye Shi, Jie Luo
Achieving high sample efficiency is a critical research area in reinforcement learning. This becomes especially difficult in multi-agent reinforcement learning (MARL), as the joint state and action space grows exponentially with the number of agents. MARL's reliance on exploration and trial and error alone, without incorporating prior knowledge, exacerbates the problem of low sample efficiency. Introducing symmetry into MARL has proven an effective way to address this issue. Yet the concept of hierarchical symmetry, which maintains symmetry across different levels of a multi-agent system (MAS), has not been explored in existing methods. This paper focuses on multi-agent cooperative tasks and proposes a method incorporating hierarchical symmetry, termed the Hierarchical Equivariant Policy Network (HEPN), which is O(n)-equivariant. Specifically, HEPN uses clustering to perform hierarchical information extraction in the MAS, and employs graph neural networks to model agent interactions. We conducted extensive experiments across various multi-agent tasks. The results indicate that our method achieves faster convergence and higher converged rewards than baseline algorithms. Additionally, we deployed our algorithm on a physical multi-robot system, confirming its effectiveness in real-world environments. Supplementary materials are available at https://yongkai-tian.github.io/HEPN/.
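To make the O(n)-equivariance property concrete, the following is a minimal sketch (not the authors' implementation) of one equivariant message-passing step over agent positions: messages are relative vectors scaled by a learned function of the invariant squared distance, so applying any orthogonal transform to the inputs transforms the outputs identically. The clustering stage and the scalar network `phi` (with illustrative random weights) are assumptions; HEPN's actual architecture is described in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny illustrative scalar network phi: invariant edge feature -> message weight.
W1, b1 = rng.normal(size=(1, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 1)), np.zeros(1)

def phi(d2):
    # Operates only on invariant scalars, so it cannot break equivariance.
    h = np.tanh(d2 @ W1 + b1)
    return h @ W2 + b2

def equivariant_layer(x):
    """One O(n)-equivariant message-passing step over agent positions x: (N, n).

    Each message is the relative vector (x_i - x_j) scaled by phi of the
    squared distance; since distances are invariant and relative vectors
    rotate with the input, the whole update commutes with any Q in O(n).
    """
    diff = x[:, None, :] - x[None, :, :]        # (N, N, n) relative vectors
    d2 = (diff ** 2).sum(-1, keepdims=True)     # (N, N, 1) invariant features
    w = phi(d2)                                 # (N, N, 1) scalar weights
    return x + (w * diff).sum(axis=1) / len(x)  # aggregated equivariant update
```

Equivariance can be checked numerically: for any orthogonal matrix Q, `equivariant_layer(x @ Q)` equals `equivariant_layer(x) @ Q`.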