JOURNAL ARTICLE

Learning Cooperative Multi-Agent Policies With Partial Reward Decoupling

Benjamin Freed, Aditya Kapoor, Ian Abraham, Jeff Schneider, Howie Choset

Year: 2021
Journal: IEEE Robotics and Automation Letters
Vol: 7 (2)
Pages: 890-897
Publisher: Institute of Electrical and Electronics Engineers

Abstract

One of the preeminent obstacles to scaling multi-agent reinforcement learning to large numbers of agents is assigning credit to individual agents' actions. In this paper, we address this credit assignment problem with an approach that we call partial reward decoupling (PRD), which attempts to decompose large cooperative multi-agent RL problems into decoupled subproblems involving subsets of agents, thereby simplifying credit assignment. We empirically demonstrate that decomposing the RL problem using PRD in an actor-critic algorithm results in lower variance policy gradient estimates, which improves data efficiency, learning stability, and asymptotic performance across a wide array of multi-agent RL tasks, compared to various other actor-critic approaches. Additionally, we relate our approach to counterfactual multi-agent policy gradient (COMA), a state-of-the-art MARL algorithm, and empirically show that our approach outperforms COMA by making better use of information in agents' reward streams, and by enabling recent advances in advantage estimation to be used.

Keywords:
Decoupling (probability), Computer science, Artificial intelligence, Engineering, Control engineering

Metrics

Cited By: 4
FWCI (Field Weighted Citation Impact): 0.42
Refs: 38
Citation Normalized Percentile: 0.70

Topics

Reinforcement Learning in Robotics (Physical Sciences → Computer Science → Artificial Intelligence)
Auction Theory and Applications (Social Sciences → Decision Sciences → Management Science and Operations Research)
Data Stream Mining Techniques (Physical Sciences → Computer Science → Artificial Intelligence)
