This work presents preliminary results and discusses current challenges in ongoing research on neuroevolution for evolving agents for autonomous cyber operations (ACO). Applying reinforcement learning (RL) to the cyber domain is especially challenging because observability of the environment is extremely limited over extended time frames, during which an adversary can potentially take many actions without being detected. To promote research in this space, The Technical Cooperation Program (TTCP), an international collaboration between the US, UK, Canada, Australia, and New Zealand, released the Cyber Operations Research Gym (CybORG) to enable experimentation with RL algorithms in both simulated and emulated environments. Using competition to spur investigation and innovation, TTCP has also released the CAGE Challenges for evaluating RL in network defense.[1] This work evolves agents for ACO using the Python-based neuroevolution library Evosax,[2] which supports high-performance, GPU-accelerated evolutionary algorithms for optimizing artificial neural network parameters. This paper is the first to apply neuroevolution to the ACO task, and it benchmarks two popular algorithms to identify factors that impact their effectiveness.