One of the most important issues in scaling up reinforcement learning to practical problems is how to represent and store cost-to-go functions more compactly than with lookup tables. We address the combination of a simple function approximation method, state aggregation, with minimax-based reinforcement learning algorithms, and present a convergence theory for online Q-hat-learning with state aggregation. Some empirical results are also included.
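The core idea named in the abstract, tabular Q-learning applied over aggregated states, can be sketched as follows. The aggregation function, action set, and step sizes below are illustrative assumptions for a one-dimensional state space, not details taken from the paper:

```python
from collections import defaultdict

def aggregate(state, cell_size=0.25):
    """Map a continuous 1-D state to a coarse cell index (state aggregation)."""
    return int(state // cell_size)

# Q-values are stored per (aggregate state, action) pair, so the table size
# depends on the number of cells, not the number of underlying states.
Q = defaultdict(float)
ACTIONS = (-1, +1)

def q_update(s, a, r, s_next, alpha=0.1, gamma=0.9):
    """One tabular Q-learning backup applied at the aggregate-state level."""
    key = (aggregate(s), a)
    target = r + gamma * max(Q[(aggregate(s_next), b)] for b in ACTIONS)
    Q[key] += alpha * (target - Q[key])
```

Because many underlying states share one cell, the learned values are approximations; the abstract's convergence theory concerns what such aggregated updates converge to.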