JOURNAL ARTICLE

Cooperative Learning for Adversarial Multi-Armed Bandit on Open Multi-Agent Systems

Tomoki Nakamura, Naoki Hayashi, Masahiro Inuiguchi

Year: 2023 Journal: IEEE Control Systems Letters Vol: 7 Pages: 1712-1717 Publisher: Institute of Electrical and Electronics Engineers

Abstract

This paper considers a cooperative decision-making method for an adversarial bandit problem on open multi-agent systems. In an open multi-agent system, the network configuration changes dynamically as agents freely enter and leave the network. We propose a distributed Exp3 policy in which agents exchange their estimates of the expected reward of each arm with active neighboring agents. Each agent then updates its probability distribution over arms by combining the estimated rewards of its neighbors. We derive a sufficient condition for a sublinear bound on the pseudo-regret. A numerical example shows that active agents can cooperatively find the optimal arm using the proposed Exp3 policy.
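The abstract does not give the update rule, so the following is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes the exponential-weights form of Exp3 with importance-weighted reward estimates, and a plain average of estimates over the agent and its currently active neighbors as the combining step. The names `exp3_probabilities`, `distributed_exp3`, `reward_fn`, `active_fn`, `neighbors`, and `eta` are hypothetical, as is the choice to freeze the estimates of inactive agents.

```python
import math
import random


def exp3_probabilities(estimates, eta):
    """Exponential-weights distribution over arms (shifted by the max for stability)."""
    top = max(estimates)
    weights = [math.exp(eta * (s - top)) for s in estimates]
    total = sum(weights)
    return [w / total for w in weights]


def distributed_exp3(reward_fn, active_fn, neighbors, num_arms, horizon, eta, seed=0):
    """Sketch of a cooperative Exp3 loop on a network whose agents come and go.

    reward_fn(t, arm) -> reward in [0, 1] (chosen by the adversary)
    active_fn(t)      -> set of agent indices that are active in round t
    neighbors[i]      -> list of agents adjacent to agent i
    Returns the per-agent cumulative reward estimates after `horizon` rounds.
    """
    rng = random.Random(seed)
    num_agents = len(neighbors)
    est = [[0.0] * num_arms for _ in range(num_agents)]

    for t in range(horizon):
        active = active_fn(t)
        updated = [row[:] for row in est]  # inactive agents keep their estimates (assumption)
        for i in sorted(active):
            # Sample an arm from the agent's current distribution.
            probs = exp3_probabilities(est[i], eta)
            arm = rng.choices(range(num_arms), weights=probs)[0]
            reward = reward_fn(t, arm)

            # Combine estimates with active neighbors (simple average, an assumption).
            group = [i] + [j for j in neighbors[i] if j in active]
            mixed = [sum(est[j][k] for j in group) / len(group) for k in range(num_arms)]

            # Importance-weighted estimate: only the pulled arm is updated.
            mixed[arm] += reward / probs[arm]
            updated[i] = mixed
        est = updated
    return est
```

A call such as `distributed_exp3(reward_fn, active_fn, [[1], [0, 2], [1]], num_arms=3, horizon=500, eta=0.05)` would run three agents on a line graph; `active_fn` can return a smaller set for some rounds to mimic an agent leaving and rejoining.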

Keywords:
Regret; Sublinear function; Adversarial system; Computer science; Multi-agent system; Mathematical optimization; Multi-armed bandit; Artificial intelligence; Mathematics; Machine learning; Discrete mathematics

Metrics

Cited By: 9
FWCI (Field Weighted Citation Impact): 2.91
Refs: 31
Citation Normalized Percentile: 0.89

Topics

Advanced Bandit Algorithms Research
Social Sciences → Decision Sciences → Management Science and Operations Research
Reinforcement Learning in Robotics
Physical Sciences → Computer Science → Artificial Intelligence
Distributed Sensor Networks and Detection Algorithms
Physical Sciences → Computer Science → Computer Networks and Communications

Related Documents

BOOK-CHAPTER

Adversarial Multi-armed Bandit

Rong Zheng, Cunqing Hua

Wireless networks Year: 2016 Pages: 41-57
BOOK-CHAPTER

Learning Cooperative Behaviours in Adversarial Multi-agent Systems

Ni Wang, Gautham P. Das, Alan G. Millard

Lecture notes in computer science Year: 2022 Pages: 179-189
JOURNAL ARTICLE

Collaborative Multi-Agent Multi-Armed Bandit Learning for Small-Cell Caching

Xianzhe Xu, Meixia Tao, Cong Shen

Journal: IEEE Transactions on Wireless Communications Year: 2020 Vol: 19 (4) Pages: 2570-2585
JOURNAL ARTICLE

Bridging Adversarial and Nonstationary Multi-Armed Bandit

Ningyuan Chen, Shuoguang Yang, Hailun Zhang

Journal: Production and Operations Management Year: 2025 Vol: 34 (8) Pages: 2218-2231
JOURNAL ARTICLE

Decentralized Multi-Agent Multi-Armed Bandit Learning With Calibration for Multi-Cell Caching

Xianzhe Xu, Meixia Tao

Journal: IEEE Transactions on Communications Year: 2020 Vol: 69 (4) Pages: 2457-2472