DISSERTATION

Knowledge-based multi-objective multi-agent reinforcement learning

Patrick Mannion

Year: 2017
University: Spectrum Research Repository (Concordia University)
Publisher: Concordia University

Abstract

Multi-Agent Reinforcement Learning (MARL) is a powerful Machine Learning paradigm in which multiple autonomous agents learn to improve the performance of a system through experience. The majority of MARL implementations aim to optimise systems with respect to a single objective, despite the fact that many real-world problems are inherently multi-objective. Examples of multi-objective problems where MARL may be applied include water resource management, traffic signal control, electricity generator scheduling and robot coordination tasks. Compromises between conflicting objectives may be defined using the concept of Pareto dominance. The Pareto optimal or non-dominated set consists of solutions that are mutually incomparable: no solution in the set is dominated by any of the others. Reward shaping has been proposed as a means to address the credit assignment problem in single-objective MARL; however, it has been shown to alter the intended goals of the domain if misused, leading to unintended behaviour. Potential-Based Reward Shaping (PBRS) and difference rewards (D) are commonly used shaping methods for MARL, both of which have been repeatedly shown to improve learning speed and the quality of joint policies learned by agents in single-objective problems. Research into multi-objective MARL is still in its infancy, and very few studies have dealt with the issue of credit assignment in this context. This thesis explores the possibility of using reward shaping to improve agent coordination in multi-objective MARL domains. The implications of using either D or PBRS are evaluated from a theoretical perspective, and the results of several empirical studies support the conclusion that these shaping techniques do not alter the true Pareto optimal solutions in multi-objective MARL domains. Therefore, the benefits of reward shaping can now be leveraged in a broader range of application domains, without the risk of altering the agents' intended goals.
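
The concepts named in the abstract can be illustrated with a minimal Python sketch. It is not taken from the thesis: all function names are hypothetical, every objective is assumed to be maximised, and the difference reward is assumed to use the common counterfactual form in which agent i's action is replaced by a fixed default. Rewards are vectors with one entry per objective, and both shaping methods are applied to each objective separately.

```python
from typing import Callable, List, Sequence

Vector = Sequence[float]


def dominates(a: Vector, b: Vector) -> bool:
    """True if a Pareto-dominates b: no worse on every objective, better on one."""
    no_worse = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return no_worse and strictly_better


def non_dominated(points: List[Vector]) -> List[Vector]:
    """Keep only the points that no other point in the list dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def pbrs(reward: Vector, phi_s: Vector, phi_next: Vector, gamma: float) -> List[float]:
    """Potential-based shaping per objective: r + gamma * Phi(s') - Phi(s)."""
    return [r + gamma * pn - ps for r, ps, pn in zip(reward, phi_s, phi_next)]


def difference_reward(
    system_utility: Callable[[Sequence[int]], Vector],
    joint_action: Sequence[int],
    agent: int,
    default_action: int,
) -> List[float]:
    """D_i = G(z) - G(z_-i) per objective (assumed counterfactual: default action)."""
    counterfactual = list(joint_action)
    counterfactual[agent] = default_action
    actual = system_utility(joint_action)
    without_agent = system_utility(counterfactual)
    return [g - c for g, c in zip(actual, without_agent)]


# Toy usage with made-up numbers (two objectives).
front = non_dominated([(1.0, 5.0), (3.0, 3.0), (2.0, 2.0), (5.0, 1.0)])
shaped = pbrs([1.0, 0.0], phi_s=[2.0, 1.0], phi_next=[4.0, 1.0], gamma=0.5)
toy_utility = lambda z: [float(sum(z)), float(-max(z))]  # hypothetical G
d_reward = difference_reward(toy_utility, [1, 2, 3], agent=2, default_action=0)
```

In the toy data, the point (2.0, 2.0) is dropped because (3.0, 3.0) is better on both objectives, while the remaining three points trade one objective against the other and are therefore mutually incomparable.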

Keywords:
Reinforcement learning; Computer science; Reinforcement; Artificial intelligence; Machine learning; Psychology; Social psychology

Topics

Elevator Systems and Control
Physical Sciences → Engineering → Control and Systems Engineering
Reinforcement Learning in Robotics
Physical Sciences → Computer Science → Artificial Intelligence
Metaheuristic Optimization Algorithms Research
Physical Sciences → Computer Science → Artificial Intelligence