In deep reinforcement learning (DRL), policy gradient (PG) and actor-critic (AC) methods are among the most popular and effective approaches for training agents. One such method is the state-of-the-art deep deterministic policy gradient (DDPG). In this work, we apply the framework of mutual learning to DDPG to present a novel Mutual DDPG (MuDDPG) agent, aiming to improve the performance and robustness of conventional DDPG. We also propose a simple additional mechanism, adaptive reward-based exploration, to further improve the rate of learning. We demonstrate that with these schemes MuDDPG converges faster and performs better than vanilla DDPG on two simple simulated tasks while adding significant robustness to the learning process.
Teckchai Tiong, Ismail Saad, Kenneth Tze Kin Teo, Herwansyah bin Lago
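The abstract does not spell out how the mutual learning term or the adaptive exploration schedule are formulated, so the following is only a minimal PyTorch sketch of one plausible reading, not the authors' method: two DDPG actors each trained with the usual deterministic policy gradient plus an assumed MSE term pulling them toward each other, and a heuristic noise schedule that shrinks as recent returns approach the best seen so far. All names (`Actor`, `Critic`, `mutual_actor_loss`, `mutual_coef`, `adaptive_noise_scale`, `sigma_min`, `sigma_max`) are illustrative assumptions.

```python
import torch
import torch.nn as nn


class Actor(nn.Module):
    """Deterministic policy network, as in standard DDPG."""
    def __init__(self, obs_dim, act_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 64), nn.ReLU(),
            nn.Linear(64, act_dim), nn.Tanh(),
        )

    def forward(self, obs):
        return self.net(obs)


class Critic(nn.Module):
    """Q-network over (state, action) pairs, as in standard DDPG."""
    def __init__(self, obs_dim, act_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim, 64), nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, obs, act):
        return self.net(torch.cat([obs, act], dim=-1))


def mutual_actor_loss(actor, peer, critic, obs, mutual_coef=0.1):
    """Standard DDPG actor objective plus an ASSUMED mutual term that
    pulls this actor's actions toward its peer's. The peer is treated
    as a fixed target here; it would receive a symmetric update."""
    act = actor(obs)
    with torch.no_grad():
        peer_act = peer(obs)
    pg_loss = -critic(obs, act).mean()            # maximize Q(s, pi(s))
    mutual_loss = ((act - peer_act) ** 2).mean()  # imitate the peer
    return pg_loss + mutual_coef * mutual_loss


def adaptive_noise_scale(recent_return, best_return,
                         sigma_min=0.05, sigma_max=0.3):
    """One plausible reward-based exploration schedule (an assumption,
    not the paper's): explore widely while returns lag the best seen,
    narrow the noise as they catch up."""
    gap = max(0.0, best_return - recent_return)
    frac = min(1.0, gap / (abs(best_return) + 1e-8))
    return sigma_min + frac * (sigma_max - sigma_min)


# Toy usage: one actor update step against its peer.
a1, a2 = Actor(3, 1), Actor(3, 1)
q = Critic(3, 1)
obs = torch.randn(32, 3)
mutual_actor_loss(a1, a2, q, obs).backward()
# An actor optimizer would step only a1's parameters; a2 gets no
# gradient here because its actions are computed under no_grad.
```

Under this reading, the mutual term plays the role of a peer-regularizer: each agent's policy is nudged toward its partner's, which is one way such schemes can smooth learning and add robustness; the actual coupling used in the paper may differ.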