Reinforcement learning has been explored in policy-based autonomic management as a way to learn from past experience and choose the right action through trial and error. In most cases, however, learning takes too long, which keeps reinforcement learning from being applied to real-time control in practice. To shorten training and speed up learning, we propose a hybrid reinforcement learning algorithm that combines Q-learning, Prioritized Sweeping, and Direct Exploration. We present this work in the context of a policy-based autonomic management system and report a simulation showing that the hybrid algorithm can significantly accelerate learning, thereby improving the overall quality of service in policy-based autonomic management.
Raphael M. Bahati, Michael Bauer