This paper proposes Reactive Adaptive eXperience based congestion control (Rax), a new congestion control (CC) method that uses online reinforcement learning (RL) to maintain an optimum congestion window with respect to a given reward function under current network conditions. Our approach is based on a neural network that can be initialized either with random weights or with a previously trained network to improve stability and convergence time. Because rewards in CC can only be processed as acknowledgements arrive, and acknowledgements are delayed and received one by one, the problem does not fit current implementations of Deep RL. As a remedy, we propose Partial Action Learning, a formulation of Deep RL that supports delayed and partial rewards. We show that our method converges to a stable, close-to-optimum solution within minutes and outperforms existing CC algorithms in typical networks. This paper thus demonstrates that Deep RL can be performed online and can compete with classic CC schemes such as Cubic.
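To illustrate what delayed and partial rewards mean in this setting, the following is a minimal sketch, not the Rax implementation. It assumes each action (a congestion window choice) covers a known number of packets and that each acknowledgement, or detected loss, contributes one partial reward. The names `PendingAction`, `PartialRewardTracker`, `record_action`, and `on_feedback` are hypothetical, and averaging the partial rewards is a placeholder for whatever reward function is actually used.

```python
from dataclasses import dataclass, field


@dataclass
class PendingAction:
    """One congestion-window decision whose reward is still incomplete."""
    cwnd: int                 # window chosen by the agent (assumed unit: packets)
    packets_sent: int         # packets sent under this decision
    partial_rewards: list = field(default_factory=list)

    def is_complete(self) -> bool:
        # Complete once every packet has produced feedback (ACK or loss).
        return len(self.partial_rewards) >= self.packets_sent

    def total_reward(self) -> float:
        # Placeholder aggregation (assumption): mean of the partial rewards.
        return sum(self.partial_rewards) / self.packets_sent


class PartialRewardTracker:
    """Collects per-packet feedback until an action's reward is complete."""

    def __init__(self):
        self.pending = {}     # action_id -> PendingAction
        self.completed = []   # (cwnd, reward) pairs ready for a learning update

    def record_action(self, action_id, cwnd, packets_sent):
        if packets_sent <= 0:
            raise ValueError("packets_sent must be positive")
        self.pending[action_id] = PendingAction(cwnd, packets_sent)

    def on_feedback(self, action_id, reward):
        # Called once per acknowledgement or detected loss.
        action = self.pending[action_id]
        action.partial_rewards.append(reward)
        if action.is_complete():
            self.completed.append((action.cwnd, action.total_reward()))
            del self.pending[action_id]


# Usage: two actions whose feedback arrives interleaved and one at a time.
tracker = PartialRewardTracker()
tracker.record_action(0, cwnd=10, packets_sent=2)
tracker.record_action(1, cwnd=12, packets_sent=1)
tracker.on_feedback(0, 1.0)   # first ACK for action 0: reward still partial
tracker.on_feedback(1, 1.0)   # action 1 completes before action 0
tracker.on_feedback(0, 0.0)   # loss for action 0: reward now complete
```

In this sketch a learning update would consume entries from `completed` only; actions still in `pending` are deferred until their remaining feedback arrives.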