BOOK-CHAPTER

Concurrent Multiagent Reinforcement Learning with Reward Machines

Abstract

Coordinating and synchronizing multiple agents in reinforcement learning (RL) presents significant challenges, particularly when concurrent actions and shared objectives are required. We propose a novel framework that integrates Reward Machines (RMs) with Partial-Order Planning (POP) to enhance coordination in multiagent reinforcement learning (MARL). By transforming high-level POP strategies into individual RMs for each agent, our approach explicitly captures action dependencies and concurrency requirements, enabling agents to learn and execute coordinated plans effectively in complex environments. We validate our approach in a grid-based multiagent domain in which agents must synchronize actions such as jointly accessing limited pathways or collaboratively manipulating objects. The explicit representation of action dependencies and synchronization points in RMs provides a scalable and flexible mechanism for modeling concurrent actions, allowing agents to focus on relevant subtasks and reducing unnecessary exploration.
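To make the core mechanism concrete, the following is a minimal sketch of a Reward Machine: a finite-state machine whose transitions fire on high-level event labels and emit rewards, with one transition gated on a joint synchronization event. All state names, event labels, and reward values below are illustrative assumptions, not details from the chapter.

```python
from dataclasses import dataclass, field

@dataclass
class RewardMachine:
    """Sketch of a Reward Machine: states, labeled transitions, rewards."""
    initial: str
    # transitions[(state, label)] = (next_state, reward)
    transitions: dict = field(default_factory=dict)
    state: str = ""

    def __post_init__(self):
        self.state = self.initial

    def step(self, label: str) -> float:
        """Advance on an observed event label; return the emitted reward.
        Unmatched labels leave the machine in place with zero reward."""
        nxt, reward = self.transitions.get((self.state, label), (self.state, 0.0))
        self.state = nxt
        return reward

# Hypothetical single-agent RM: the agent reaches a shared pathway, waits
# at a synchronization point until the joint event "both_at_door" occurs,
# then proceeds and collects the task reward on "goal".
rm = RewardMachine(
    initial="u0",
    transitions={
        ("u0", "at_door"): ("u_wait", 0.0),         # reach the shared pathway
        ("u_wait", "both_at_door"): ("u_go", 0.1),  # synchronization event
        ("u_go", "goal"): ("u_done", 1.0),          # complete the subtask
    },
)
```

In a multiagent setting, each agent would run its own RM derived from the partial-order plan, and the synchronization label (here `both_at_door`) would be true only when the participating agents jointly satisfy the condition, which is how the RM encodes concurrency requirements as explicit machine states.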


Topics

Reinforcement Learning in Robotics
Physical Sciences → Computer Science → Artificial Intelligence
© 2026 ScienceGate Book Chapters — All rights reserved.