It has been shown that frequency hopping and pulse-width allocation strategies can enhance the anti-jamming performance of radar systems. Existing anti-jamming methods, however, often have difficulty adapting their policy to complicated and unpredictable jamming environments. To address this limitation, a reinforcement learning-based joint adaptive frequency hopping and pulse-width allocation scheme is proposed. By applying reinforcement learning, the radar can learn an optimized anti-jamming policy through interaction with the environment while requiring little prior information. In the proposed scheme, we first establish a reward model to quantify the performance of radar anti-jamming decisions. The radar anti-jamming decision process is then modeled as a Markov decision process. Q-learning, a widely used reinforcement learning algorithm that can converge to the optimized policy with probability 1, is used to learn the radar anti-jamming policy without perfect knowledge of the environment. Numerical results verify the effectiveness of the proposed strategy.
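The pipeline described above (reward model, Markov decision process, Q-learning) can be illustrated with a minimal tabular sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the four sub-bands, the two pulse widths, the deterministic sweep jammer, the toy reward (plus the pulse width when the pulse is un-jammed, minus the pulse width when it is jammed), and the hyperparameters. The state is taken to be the sub-band the jammer occupied on the previous pulse, and an action is a (sub-band, pulse-width) pair.

```python
import random

N_FREQ = 4                 # hypothetical number of sub-bands
PULSE_WIDTHS = (1.0, 2.0)  # hypothetical pulse widths (arbitrary units)
# Joint action: (sub-band index, pulse-width index)
ACTIONS = [(f, w) for f in range(N_FREQ) for w in range(len(PULSE_WIDTHS))]


def jammer_next(jammed_freq):
    """Toy sweep jammer (assumed): moves to the next sub-band each pulse."""
    return (jammed_freq + 1) % N_FREQ


def reward(action, jammed_freq):
    """Toy reward (assumed): +width if the pulse is un-jammed, -width otherwise."""
    freq, w_idx = action
    width = PULSE_WIDTHS[w_idx]
    return -width if freq == jammed_freq else width


def train(episodes=2000, steps=20, alpha=0.1, gamma=0.9, eps=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    # q[state][action_index]; state = sub-band jammed on the previous pulse
    q = [[0.0] * len(ACTIONS) for _ in range(N_FREQ)]
    for _ in range(episodes):
        state = rng.randrange(N_FREQ)
        for _ in range(steps):
            if rng.random() < eps:
                a = rng.randrange(len(ACTIONS))  # explore
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q[state][i])  # exploit
            nxt = jammer_next(state)        # the radar does not know this rule
            r = reward(ACTIONS[a], nxt)     # it only observes the outcome
            td_target = r + gamma * max(q[nxt])
            q[state][a] += alpha * (td_target - q[state][a])
            state = nxt
    return q


def greedy_action(q, state):
    """Return the (sub-band, pulse-width index) with the highest Q-value."""
    best = max(range(len(ACTIONS)), key=lambda i: q[state][i])
    return ACTIONS[best]
```

Calling `q = train()` and then `greedy_action(q, s)` for each state `s` gives the learned joint choice of sub-band and pulse width. In this toy setting the learner never sees the jammer's sweep rule directly; it infers which sub-band to avoid from the rewards alone, which is the sense in which little prior information is required.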
Kang Li, Bo Jiu, Liu Hongwei, Siyuan Liang
Ailiya, Wei Yi, Pramod K. Varshney
Pengfei Liu, Lei Wang, Shan Zhao, Yimin Liu
Sixi Cheng, Xiang Ling, Lidong Zhu
Muhammad Majid Aziz, Abdur Rahman Maud, Aamir Habib