Certified Policy Smoothing for Cooperative Multi-Agent Reinforcement Learning

Ronghui Mu; Wenjie Ruan; Leandro Soriano Marcolino; Gaojie Jin; Qiang Ni

doi:10.1609/aaai.v37i12.26756

JOURNAL ARTICLE

Year: 2023
Journal: Proceedings of the AAAI Conference on Artificial Intelligence
Vol: 37 (12)
Pages: 15046-15054
Publisher: Association for the Advancement of Artificial Intelligence

Abstract

Cooperative multi-agent reinforcement learning (c-MARL) is widely applied in safety-critical scenarios, so analysing the robustness of c-MARL models is critically important. However, robustness certification for c-MARL has not yet been explored in the community. In this paper, we propose a novel certification method, the first to use a scalable approach for c-MARL to determine actions with guaranteed certified bounds. Compared with single-agent systems, c-MARL certification poses two key challenges: (i) uncertainty accumulates as the number of agents increases; (ii) changing the action of a single agent may have little impact on the global team reward. These challenges prevent us from directly using existing algorithms. Hence, we employ a false discovery rate (FDR) controlling procedure that takes the importance of each agent into account to certify per-state robustness. We further propose a tree-search-based algorithm to find a lower bound of the global reward under the minimal certified perturbation. As our method is general, it can also be applied in a single-agent environment. We empirically show that our certification bounds are much tighter than those of state-of-the-art RL certification solutions. We also evaluate our method on two popular c-MARL algorithms, QMIX and VDN, in two different environments, with two and four agents. The experimental results show that our method can certify the robustness of all the evaluated c-MARL models in these environments. Our tool CertifyCMARL is available at https://github.com/TrustAI/CertifyCMARL.
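
For readers unfamiliar with FDR control, the following is a minimal sketch of the standard Benjamini-Hochberg step-up procedure. It is an illustration only: the abstract does not say which FDR-controlling procedure the paper uses or how agent importance enters it, so the function name, the per-agent p-values, and the level alpha below are all hypothetical.

```python
from typing import List


def benjamini_hochberg(p_values: List[float], alpha: float = 0.05) -> List[bool]:
    """Return one reject flag per hypothesis, controlling the FDR at level alpha.

    Standard Benjamini-Hochberg step-up procedure; NOT taken from the paper.
    """
    m = len(p_values)
    if m == 0:
        return []
    # Indices of the hypotheses, ordered by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest 1-based rank k whose p-value satisfies p_(k) <= (k / m) * alpha.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            k_max = rank
    # Reject the k_max hypotheses with the smallest p-values.
    reject = [False] * m
    for idx in order[:k_max]:
        reject[idx] = True
    return reject


# Hypothetical p-values, one per agent (assumed for illustration).
agent_p_values = [0.001, 0.04, 0.03, 0.20]
print(benjamini_hochberg(agent_p_values, alpha=0.05))
```

With several agents tested at once, a procedure of this kind bounds the expected share of false rejections rather than testing each agent at a fixed level in isolation, which is one way to handle uncertainty that grows with the number of agents.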

Keywords: Certification; Robustness (evolution); Reinforcement learning; Computer science; MARL; Scalability; Leverage (statistics); Mathematical optimization; Machine learning; Mathematics; Database; Economics

Metrics

FWCI (Field Weighted Citation Impact): 11.93

Citation Normalized Percentile: 1.00

Topics

Safety Systems Engineering in Autonomy

Physical Sciences → Engineering → Safety, Risk, Reliability and Quality

Related Documents

Hierarchical Policy Optimization for Cooperative Multi-Agent Reinforcement Learning

QDAP: Downsizing adaptive policy for cooperative multi-agent reinforcement learning

DeCOM: Decomposed Policy for Constrained Cooperative Multi-Agent Reinforcement Learning

Multi-Agent Cooperative Fuzzy Reinforcement Learning

LAMARL: LLM-Aided Multi-Agent Reinforcement Learning for Cooperative Policy Generation