We consider statistical Markov Decision Processes where the decision maker is risk averse against model ambiguity. Model ambiguity is represented by an unknown parameter that influences the transition law and the cost functions. Risk aversion is measured either by the entropic risk measure or by the Average Value at Risk. We show how to solve this kind of problem using a general minimax theorem. Under some continuity and compactness assumptions we prove the existence of an optimal (deterministic) policy and discuss its computation. We illustrate our results using an example from statistical decision theory.
Ariel Neufeld, Julian Sester, Mario Šikić
Yasemin Serin, Vidyadhar G. Kulkarni
Neufeld, Ariel; Sester, Julian; Sikic, Mario