In this work, we propose a constrained deep reinforcement learning (CDRL)-based approach to resource allocation for multi-target tracking in a radar system. In the proposed CDRL algorithm, the parameters of the deep Q-network (DQN) and the dual variable are learned simultaneously. The framework consists of two components: online CDRL and offline CDRL. Training a DQN typically requires a large amount of data, which may not be available in a target tracking task due to the scarcity of measurements. We address this challenge with the offline CDRL component, in which the algorithm evolves in a virtual environment generated from the current observations and prior knowledge of the environment. Simulation results show that both components are essential: offline CDRL provides additional training data to stabilize the learning process, while online CDRL senses changes in the environment and adapts accordingly.
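The core idea of learning the value function and the dual variable simultaneously can be sketched with a primal-dual update on a toy problem. The sketch below is a one-state (bandit) caricature with hypothetical rewards, costs, and a budget, not the paper's radar model: the primal step here is a simple value update standing in for DQN training, and the dual step prices resource usage.

```python
import numpy as np

# Toy primal-dual sketch of "learn the Q-values and the dual variable
# simultaneously". All numbers are illustrative assumptions, not from
# the paper: each action has a tracking utility and a resource cost,
# and the long-run resource cost should stay under a budget.

rewards = np.array([1.0, 2.0, 3.0])   # tracking utility per action (assumed)
costs   = np.array([0.2, 0.5, 1.0])   # resource cost per action (assumed)
budget  = 0.6                         # average resource budget (assumed)

Q = np.zeros(3)       # value estimates for the Lagrangian reward r - lam * c
lam = 0.0             # dual variable (price on resource usage)
alpha, eta = 0.5, 0.05

for _ in range(3000):
    a = int(np.argmax(Q))                          # greedy primal policy
    # dual ascent: raise the price when the chosen action overspends,
    # lower it (down to zero) when there is slack
    lam = max(0.0, lam + eta * (costs[a] - budget))
    # primal update: track the Lagrangian reward for every action
    Q += alpha * ((rewards - lam * costs) - Q)

print(round(lam, 2))
```

The dual variable settles near the price at which the budget binds and the two high-value actions tie in Lagrangian value. In the paper's setting, the primal step is replaced by DQN training, and the offline CDRL component supplies additional (virtual) transitions for that training.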
Jiahao Qin, Mengtao Zhu, Zesi Pan, Yunjie Li, Yan Li