Benjamin Freed, Aditya Kapoor, Ian Abraham, Jeff Schneider, Howie Choset
One of the preeminent obstacles to scaling multi-agent reinforcement learning to large numbers of agents is assigning credit to individual agents' actions. In this paper, we address this credit assignment problem with an approach that we call \textit{partial reward decoupling} (PRD), which attempts to decompose large cooperative multi-agent RL problems into decoupled subproblems involving subsets of agents, thereby simplifying credit assignment. We empirically demonstrate that decomposing the RL problem using PRD in an actor-critic algorithm results in lower-variance policy gradient estimates, which improves data efficiency, learning stability, and asymptotic performance across a wide array of multi-agent RL tasks, compared to various other actor-critic approaches. Additionally, we relate our approach to counterfactual multi-agent policy gradient (COMA), a state-of-the-art MARL algorithm, and empirically show that our approach outperforms COMA by making better use of information in agents' reward streams, and by enabling recent advances in advantage estimation to be used.
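To make the decoupling idea concrete, below is a minimal, hypothetical sketch of PRD-style advantage estimation. It assumes per-agent reward streams, per-agent critic value estimates, and a precomputed boolean relevance mask identifying which agents' rewards are attributed to each agent's actions; the function name, tensor shapes, and the one-step TD form are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def prd_advantages(rewards, values, relevance, gamma=0.99):
    """Hypothetical PRD-style advantage estimation.

    rewards:   (T, N) array, per-agent reward stream over T timesteps
    values:    (T, N) array, per-agent critic value estimates
    relevance: (N, N) boolean mask; relevance[i, j] = True if agent j's
               reward is attributed to agent i's actions
    Returns:   (T, N) array of per-agent advantage estimates
    """
    T, N = rewards.shape
    advantages = np.zeros((T, N))
    for i in range(N):
        # Decoupled reward for agent i: sum only over its relevant subset,
        # rather than the full team-wide reward. This is the step intended
        # to reduce policy gradient variance by excluding rewards that
        # agent i's actions cannot influence.
        r_i = rewards[:, relevance[i]].sum(axis=1)
        # One-step TD advantage against agent i's own value estimate
        # (bootstrap value of 0 at the terminal step).
        next_v = np.append(values[1:, i], 0.0)
        advantages[:, i] = r_i + gamma * next_v - values[:, i]
    return advantages
```

In this sketch, setting `relevance` to all-True recovers an ordinary shared-reward advantage, while a sparser mask yields the decoupled subproblems the abstract describes; how the relevant subsets are identified is the substance of the paper and is not modeled here.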