Abstract

In this paper, a novel distributed on-policy Actor-Critic algorithm for multi-agent reinforcement learning is proposed. The algorithm combines a temporal-difference scheme with function approximation at the Critic stage and a policy-gradient algorithm at the Actor stage, both derived from a global objective. At both stages, decentralized agreement among the agents is achieved using a linear dynamic consensus strategy. Compared to existing schemes, the algorithm offers an improved convergence rate, better noise immunity, and the possibility of multi-task global optimization.
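The abstract's core idea, combining local temporal-difference updates with linear dynamic consensus among agents, can be illustrated with a minimal sketch. This is not the paper's algorithm: the ring topology, the toy reward model, the step sizes, and the Critic-only update are all assumptions made for illustration.

```python
import numpy as np

# Illustrative sketch only: each agent keeps linear Critic weights w_i,
# performs a local TD(0) update, then mixes its weights with neighbours'
# via a doubly stochastic consensus matrix A (ring topology assumed).

rng = np.random.default_rng(0)

n_agents, n_features = 4, 3
gamma, alpha = 0.9, 0.01          # discount factor, Critic step size

# Ring-topology consensus weights (doubly stochastic by construction).
A = np.zeros((n_agents, n_agents))
for i in range(n_agents):
    A[i, i] = 0.5
    A[i, (i + 1) % n_agents] = 0.25
    A[i, (i - 1) % n_agents] = 0.25

W = rng.normal(size=(n_agents, n_features))   # per-agent Critic weights

for step in range(2000):
    # Each agent observes its own transition (phi, reward, phi_next);
    # the reward model r = phi @ 1 is a toy assumption.
    phi = rng.normal(size=(n_agents, n_features))
    phi_next = rng.normal(size=(n_agents, n_features))
    r = phi @ np.ones(n_features)

    # Local TD(0) update with linear function approximation.
    delta = (r + gamma * np.einsum('ij,ij->i', phi_next, W)
               - np.einsum('ij,ij->i', phi, W))
    W = W + alpha * delta[:, None] * phi

    # Linear dynamic consensus step: mix weights with neighbours.
    W = A @ W

# The agents' weight vectors should approximately agree after many steps.
disagreement = np.max(np.abs(W - W.mean(axis=0)))
```

The consensus multiplication `A @ W` contracts the disagreement between agents at every step, while the TD updates drive the (approximately shared) weights toward a common value-function estimate; the paper applies an analogous agreement mechanism at the Actor stage as well.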

Keywords:
Reinforcement learning; Computer science; Artificial intelligence; Human–computer interaction


