Feature Markov Decision Processes (ΦMDPs) [Hut09] are well-suited for learning agents in general environments. Nevertheless, unstructured (Φ)MDPs are limited to relatively simple environments. Structured MDPs like Dynamic Bayesian Networks (DBNs) are used for large-scale real-world problems. In this article I extend ΦMDP to ΦDBN. The primary contribution is to derive a cost criterion that allows the most relevant features to be extracted automatically from the environment, leading to the "best" DBN representation. I discuss all building blocks required for a complete general learning algorithm.
Joe Frankel, Mirjam Wester, Simon King
Karen Livescu, James Glass, Jeff Bilmes