Cooperative Learning for Adversarial Multi-Armed Bandit on Open Multi-Agent Systems

Tomoki Nakamura; Naoki Hayashi; Masahiro Inuiguchi

doi:10.1109/lcsys.2023.3279788

JOURNAL ARTICLE

Year: 2023
Journal: IEEE Control Systems Letters
Vol: 7
Pages: 1712-1717
Publisher: Institute of Electrical and Electronics Engineers

Abstract

This paper considers a cooperative decision-making method for an adversarial bandit problem on open multi-agent systems. In an open multi-agent system, the network configuration changes dynamically as agents freely enter and leave the network. We propose a distributed Exp3 policy in which the agents exchange their estimates of the expected reward of each arm with active neighboring agents. Each agent then updates its probability distribution over the arms by combining the estimated rewards of its neighboring agents. We derive a sufficient condition for a sublinear bound on the pseudo-regret. A numerical example shows that, under the proposed Exp3 policy, active agents can cooperatively find the optimal arm.
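The abstract above does not give the update rules, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes the standard Exp3 importance-weighted reward estimate, a plain average of cumulative estimates over each agent's currently active neighbors, a hypothetical ring network, random agent activity with probability `p_active`, and a toy reward sequence in which arm 0 is best. The function names and the parameters `gamma`, `eta`, and `p_active` are all illustrative assumptions.

```python
import math
import random


def exp3_probs(est, gamma, eta):
    """Exp3 distribution: exponential weights mixed with uniform exploration."""
    m = max(est)  # subtract the max for numerical stability
    w = [math.exp(eta * (x - m)) for x in est]
    total = sum(w)
    k = len(est)
    return [(1 - gamma) * wi / total + gamma / k for wi in w]


def simulate_distributed_exp3(num_agents=4, num_arms=3, rounds=2000,
                              gamma=0.1, eta=0.05, p_active=0.8, seed=0):
    rng = random.Random(seed)
    # est[i][a]: agent i's cumulative importance-weighted estimate for arm a
    est = [[0.0] * num_arms for _ in range(num_agents)]
    # Assumption: a fixed ring; agents "leave" by being inactive in a round.
    neighbors = {i: [(i - 1) % num_agents, (i + 1) % num_agents]
                 for i in range(num_agents)}

    for t in range(rounds):
        # Assumption: toy rewards in [0, 1]; arm 0 always pays the most.
        rewards = [1.0 if a == 0 else (0.4 if (t + a) % 2 == 0 else 0.2)
                   for a in range(num_arms)]
        active = {i for i in range(num_agents) if rng.random() < p_active}

        # Local Exp3 step: sample an arm, update only that arm's estimate.
        for i in active:
            p = exp3_probs(est[i], gamma, eta)
            arm = rng.choices(range(num_arms), weights=p)[0]
            est[i][arm] += rewards[arm] / p[arm]

        # Cooperative step: average estimates with active neighbors only.
        mixed = {}
        for i in active:
            group = [i] + [j for j in neighbors[i] if j in active]
            mixed[i] = [sum(est[j][a] for j in group) / len(group)
                        for a in range(num_arms)]
        for i, values in mixed.items():
            est[i] = values

    # Final arm-selection distribution of every agent.
    return [exp3_probs(e, gamma, eta) for e in est]
```

Calling `simulate_distributed_exp3()` returns one probability vector per agent. In this toy setting the probability mass should concentrate on arm 0, up to the uniform exploration floor `gamma / num_arms` kept on the other arms.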

Keywords:

Regret, Sublinear function, Adversarial system, Computer science, Multi-agent system, Mathematical optimization, Multi-armed bandit, Artificial intelligence, Mathematics, Machine learning, Discrete mathematics

Metrics

FWCI (Field Weighted Citation Impact): 2.91

Citation Normalized Percentile: 0.89

Topics

Advanced Bandit Algorithms Research

Social Sciences → Decision Sciences → Management Science and Operations Research

Reinforcement Learning in Robotics

Physical Sciences → Computer Science → Artificial Intelligence

Distributed Sensor Networks and Detection Algorithms

Physical Sciences → Computer Science → Computer Networks and Communications

Related Documents

Adversarial Multi-armed Bandit

Learning Cooperative Behaviours in Adversarial Multi-agent Systems

Collaborative Multi-Agent Multi-Armed Bandit Learning for Small-Cell Caching

Bridging Adversarial and Nonstationary Multi-Armed Bandit

Decentralized Multi-Agent Multi-Armed Bandit Learning With Calibration for Multi-Cell Caching