The trial-and-error mechanism of reinforcement learning is essentially a form of exhaustive search, which is the main reason reinforcement learning is slow and time-consuming. We present an approach that models the state transitions observed during the agent's exploration with a Shaping Bayesian Network, which can then be used to bias the agent's exploration toward the most promising regions of the state space, thereby reducing exploration and accelerating learning. Experimental results show that this approach significantly improves the agent's performance and shortens learning time. More importantly, it provides a way for the agent to exploit its own experience to accelerate learning.
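The abstract does not specify the learning algorithm, so the following is only a minimal illustrative sketch of the general idea: alongside ordinary Q-learning on a toy chain task, the agent fits a simple count-based transition model from its own experience (a stand-in for the paper's Shaping Bayesian Network) and, when exploring, prefers the action whose predicted successor state looks most promising under the current value estimates. All names, the environment, and the count-based model are assumptions for illustration, not the paper's method.

```python
import random
from collections import defaultdict

random.seed(0)

# Toy chain environment (assumption, not from the paper):
# states 0..9, reward 1 only on reaching the rightmost state.
N_STATES = 10
ACTIONS = [-1, +1]  # move left / move right

def step(s, a):
    s2 = min(max(s + a, 0), N_STATES - 1)
    return s2, (1.0 if s2 == N_STATES - 1 else 0.0)

Q = defaultdict(float)                           # Q[(s, a)]
counts = defaultdict(lambda: defaultdict(int))   # counts[(s, a)][s2]

def predicted_next(s, a):
    """Most frequently observed successor of (s, a), if any."""
    succ = counts[(s, a)]
    return max(succ, key=succ.get) if succ else None

def choose_action(s, eps=0.3):
    if random.random() < eps:
        # Biased exploration: score each action by the estimated value of
        # the successor the learned model predicts for it.
        def score(a):
            s2 = predicted_next(s, a)
            return max(Q[(s2, b)] for b in ACTIONS) if s2 is not None else 0.0
        scores = {a: score(a) for a in ACTIONS}
        if max(scores.values()) > 0:
            return max(ACTIONS, key=scores.get)
        return random.choice(ACTIONS)  # no useful model signal yet
    # Greedy action with random tie-breaking.
    qs = {a: Q[(s, a)] for a in ACTIONS}
    best = max(qs.values())
    return random.choice([a for a in ACTIONS if qs[a] == best])

alpha, gamma = 0.5, 0.95
for _ in range(500):           # training episodes
    s = 0
    for _ in range(50):        # step limit per episode
        a = choose_action(s)
        s2, r = step(s, a)
        counts[(s, a)][s2] += 1   # update the experience-based model
        Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, b)] for b in ACTIONS) - Q[(s, a)])
        s = s2
        if r > 0:
            break

greedy = [max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N_STATES - 1)]
print(greedy)
```

Once the model has seen enough transitions, exploratory moves stop being uniformly random and instead push the agent toward regions its own experience marks as promising, which is the acceleration effect the abstract describes.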