JOURNAL ARTICLE

Belief Reward Shaping in Reinforcement Learning

Ofir Marom, Benjamin Rosman

Year: 2018 Journal: Proceedings of the AAAI Conference on Artificial Intelligence Vol: 32 (1) Publisher: Association for the Advancement of Artificial Intelligence

Abstract

A key challenge in many reinforcement learning problems is delayed rewards, which can significantly slow down learning. Although reward shaping has previously been introduced to accelerate learning by bootstrapping an agent with additional information, this can lead to problems with convergence. We present a novel Bayesian reward shaping framework that augments the reward distribution with prior beliefs that decay with experience. Formally, we prove that under suitable conditions a Markov decision process augmented with our framework is consistent with the optimal policy of the original MDP when using the Q-learning algorithm. However, in general our method integrates seamlessly with any reinforcement learning algorithm that learns a value or action-value function through experience. Experiments are run on a gridworld and a more complex backgammon domain that show that we can learn tasks significantly faster when we specify intuitive priors on the reward distribution.
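
The abstract does not state the update rule, so the following is only a minimal sketch of one way a prior belief about rewards could decay with experience. It is not the authors' formulation. It assumes a Normal prior on the mean reward of each state-action pair with known noise variance, so the shaped reward is a posterior mean in which the prior counts as a fixed number of pseudo-observations. The names `BeliefReward`, `q_learning_step`, and `prior_strength` are hypothetical.

```python
from collections import defaultdict


class BeliefReward:
    """Posterior-mean reward per (state, action) under a Normal prior.

    Assumption: reward noise has known variance, and the prior is worth
    `prior_strength` pseudo-observations. Both are illustrative choices.
    """

    def __init__(self, prior_mean, prior_strength=4.0):
        self.prior_mean = prior_mean          # callable: (state, action) -> float
        self.prior_strength = prior_strength  # weight of the prior, in pseudo-counts
        self.count = defaultdict(int)         # observed rewards per (state, action)
        self.total = defaultdict(float)       # sum of observed rewards

    def update(self, state, action, reward):
        """Record an observed reward and return the shaped (posterior-mean) reward."""
        key = (state, action)
        self.count[key] += 1
        self.total[key] += reward
        k, n = self.prior_strength, self.count[key]
        # As n grows, the prior term k * prior_mean loses weight relative to the data.
        return (k * self.prior_mean(state, action) + self.total[key]) / (k + n)


def q_learning_step(q, belief, actions, s, a, r, s_next, done, alpha=0.5, gamma=0.9):
    """One tabular Q-learning update that uses the shaped reward in place of r."""
    shaped = belief.update(s, a, r)
    bootstrap = 0.0 if done else gamma * max(q[(s_next, b)] for b in actions)
    q[(s, a)] += alpha * (shaped + bootstrap - q[(s, a)])
    return shaped


if __name__ == "__main__":
    # Hypothetical optimistic prior: every action is believed to pay 1.0.
    belief = BeliefReward(prior_mean=lambda s, a: 1.0)
    q = defaultdict(float)
    for step in range(1, 4):
        shaped = q_learning_step(q, belief, ("left", "right"), 0, "right", 0.0, 1, False)
        print(step, round(shaped, 3))  # prints 0.8, then 0.667, then 0.571
```

In this toy run the environment always returns a reward of 0, so the shaped reward falls from the prior value of 1.0 toward 0 as observations accumulate. This mirrors the abstract's description of prior beliefs that decay with experience, under the stated assumptions.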

Keywords:
Reinforcement learning; Markov decision process; Computer science; Artificial intelligence; Bootstrapping (finance); Prior probability; Bellman equation; Convergence (economics); Machine learning; Temporal difference learning; Bayesian probability; Process (computing); Markov process; Mathematical optimization; Mathematics; Econometrics

Metrics

Cited By: 71
FWCI (Field Weighted Citation Impact): 3.08
Refs: 27
Citation Normalized Percentile: 0.92

Topics

Reinforcement Learning in Robotics
Physical Sciences →  Computer Science →  Artificial Intelligence
Advanced Bandit Algorithms Research
Social Sciences →  Decision Sciences →  Management Science and Operations Research
Adversarial Robustness in Machine Learning
Physical Sciences →  Computer Science →  Artificial Intelligence

Related Documents

BOOK-CHAPTER

Multigrid Reinforcement Learning with Reward Shaping

Marek Grześ, Daniel Kudenko

Lecture notes in computer science Year: 2008 Pages: 357-366
JOURNAL ARTICLE

Reward Shaping in Episodic Reinforcement Learning

Marek Grześ

Journal:   Adaptive Agents and Multi-Agents Systems Year: 2017 Pages: 565-573
JOURNAL ARTICLE

Reward Shaping Based Federated Reinforcement Learning

Yiqiu Hu, Yun Hua, Wenyan Liu, Jun Zhu

Journal:   IEEE Access Year: 2021 Vol: 9 Pages: 67259-67267
JOURNAL ARTICLE

Hindsight Reward Shaping in Deep Reinforcement Learning

Byron de Villiers, Deon Sabatta

Journal:   2020 International SAUPEC/RobMech/PRASA Conference Year: 2020 Vol: 521 Pages: 1-7