Setting the hyperparameters of reinforcement learning (RL) algorithms correctly is essential for good performance and convergence. Tuning them by hand is impractical, since it consumes considerable time and effort, so it is advisable to automate the search with computational tools. Evolutionary computation (EC) techniques are well suited to tuning and optimizing the hyperparameters of such algorithms. In this project we used a genetic algorithm (GA) to find the hyperparameter values that best fit the SARSA and Q-learning RL algorithms on the underactuated pendulum swing-up task, maximizing both the final reward acquired and the agent's learning speed. A fairly simple GA produced good solutions, although multiple random restarts were required to escape local minima.
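A minimal sketch of the approach the abstract describes: a GA evolves the Q-learning hyperparameters (learning rate alpha, discount gamma, exploration epsilon), with fitness defined as the total reward collected during training, and with random restarts of the whole GA as the abstract mentions. The environment here is a hypothetical toy chain task standing in for the pendulum swing-up, and all numeric settings (population size, generations, bounds) are illustrative assumptions, not values from the paper.

```python
import random

# Hypothetical stand-in environment: a 1-D chain of states 0..N-1,
# actions left/right, reward 1 only on reaching the rightmost state.
N_STATES = 6
ACTIONS = (-1, +1)

def run_qlearning(alpha, gamma, epsilon, episodes=50, rng=None):
    """Train tabular Q-learning; return total reward (the GA fitness)."""
    rng = rng or random.Random(0)  # fixed seed: deterministic fitness per individual
    Q = [[0.0, 0.0] for _ in range(N_STATES)]
    total = 0.0
    for _ in range(episodes):
        s = 0
        for _ in range(40):  # step cap per episode
            # epsilon-greedy action selection, random tie-breaking
            if rng.random() < epsilon or Q[s][0] == Q[s][1]:
                a = rng.randrange(2)
            else:
                a = 0 if Q[s][0] > Q[s][1] else 1
            s2 = min(max(s + ACTIONS[a], 0), N_STATES - 1)
            r = 1.0 if s2 == N_STATES - 1 else 0.0
            Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
            total += r
            s = s2
            if r > 0:
                break  # episode ends at the goal
    return total

def genetic_search(pop_size=12, generations=15, restarts=3, seed=1):
    """GA over (alpha, gamma, epsilon); restarted to escape local optima."""
    best_fit, best_ind = float("-inf"), None
    for restart in range(restarts):
        rng = random.Random(seed + restart)
        pop = [[rng.uniform(0.01, 1.0),    # alpha
                rng.uniform(0.5, 0.999),   # gamma
                rng.uniform(0.01, 0.5)]    # epsilon
               for _ in range(pop_size)]
        for _ in range(generations):
            scored = sorted(((run_qlearning(*ind), ind) for ind in pop),
                            reverse=True)
            if scored[0][0] > best_fit:
                best_fit, best_ind = scored[0][0], list(scored[0][1])
            # truncation selection: keep the top half as parents (elitism)
            parents = [ind for _, ind in scored[:pop_size // 2]]
            children = []
            while len(children) < pop_size - len(parents):
                a, b = rng.sample(parents, 2)
                child = [(x + y) / 2 for x, y in zip(a, b)]  # blend crossover
                i = rng.randrange(3)                          # mutate one gene
                child[i] = min(max(child[i] + rng.gauss(0, 0.05), 0.001), 0.999)
                children.append(child)
            pop = parents + children
    return best_fit, best_ind

score, params = genetic_search()
```

The fitness function rewards both final policy quality and learning speed implicitly, since total reward over a fixed training budget is higher when the agent starts earning rewards earlier; the outer `restarts` loop mirrors the abstract's observation that several random restarts were needed.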