Reinforcement learning (RL) has become a widely used approach for pursuit-evasion games. However, the behavior of such RL models is hard to analyze, which often leads to a lack of trust. This paper describes a study in which we used machine learning (ML) approaches to develop metareasoning policies that control the pursuers' strategies. The proposed approach enables pursuer agents to capture a faster evader by collaboratively choosing among simple pursuit strategies. The results show that some metareasoning policies perform better than any combination of pursuer strategies. Our approach gives the pursuer agents a way to reason about their opponents and adapt their strategies, which could have significant implications for the design of intelligent agents in real-world applications.
Penglin Hu, Yaning Guo, Jinwen Hu, Quan Pan
Viliam Lisý, Branislav Bošanský, Michal Pěchouček