JOURNAL ARTICLE

Anti-Jamming Decision-Making for Phased-Array Radar Based on Improved Deep Reinforcement Learning

Hang Zhao, Hu Song, Rong Liu, Jiao Hou, Xianxiang Yu

Year: 2025
Journal: Electronics
Vol: 14 (11)
Pages: 2305-2305
Publisher: Multidisciplinary Digital Publishing Institute

Abstract

In existing phased-array radar systems, anti-jamming strategies are mainly generated through manual judgment. However, manually designing or selecting anti-jamming decisions is often difficult and unreliable in complex jamming environments. Therefore, reinforcement learning is applied to anti-jamming decision-making to solve the above problems. However, the existing anti-jamming decision-making models based on reinforcement learning often suffer from problems such as low convergence speeds and low decision-making accuracy. In this paper, a multi-aspect improved deep Q-network (MAI-DQN) is proposed to improve the exploration policy, the network structure, and the training methods of the deep Q-network. In order to solve the problem of the ϵ-greedy strategy being highly dependent on hyperparameter settings, and the Q-value being overly influenced by the action in other deep Q-networks, this paper proposes a structure that combines a noisy network, a dueling network, and a double deep Q-network, which incorporates an adaptive exploration policy into the neural network and increases the influence of the state itself on the Q-value. These enhancements enable a highly adaptive exploration strategy and a high-performance network architecture, thereby improving the decision-making accuracy of the model. In order to calculate the target value more accurately during the training process and improve the stability of the parameter update, this paper proposes a training method that combines n-step learning, target soft update, variable learning rate, and gradient clipping. Moreover, a novel variable double-depth priority experience replay (VDDPER) method that more accurately simulates the storage and update mechanism of human memory is used in the MAI-DQN. 
The VDDPER improves the decision-making accuracy by dynamically adjusting the sample size based on different values of experience during training, enhancing exploration during the early stages of training, and placing greater emphasis on high-value experiences in the later stages. Enhancements to the training method improve the model’s convergence speed. Moreover, a reward function combining signal-level and data-level benefits is proposed to adapt to complex jamming environments, which ensures a high reward convergence speed with fewer computational resources. The findings of a simulation experiment show that the proposed phased-array radar anti-jamming decision-making method based on MAI-DQN can achieve a high convergence speed and high decision-making accuracy in environments where deceptive jamming and suppressive jamming coexist.
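
The abstract names several standard building blocks (a dueling network, a double deep Q-network, n-step learning, and target soft update) without giving formulas. The following is a minimal sketch of the textbook form of each, not the authors' implementation; the function names, argument names, and the hyperparameters `gamma` and `tau` are assumptions chosen for illustration, and the noisy network and VDDPER components are not covered.

```python
from typing import List, Sequence


def dueling_q(value: float, advantages: Sequence[float]) -> List[float]:
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)."""
    mean_adv = sum(advantages) / len(advantages)
    return [value + adv - mean_adv for adv in advantages]


def n_step_double_dqn_target(
    rewards: Sequence[float],        # r_t ... r_{t+n-1} (hypothetical input)
    q_online_next: Sequence[float],  # online-network Q-values at s_{t+n}
    q_target_next: Sequence[float],  # target-network Q-values at s_{t+n}
    gamma: float,
    done: bool,
) -> float:
    """n-step return with a double-DQN bootstrap term."""
    n = len(rewards)
    n_step_return = sum((gamma ** k) * r for k, r in enumerate(rewards))
    if done:
        # Terminal state: no bootstrap value is added.
        return n_step_return
    # Double DQN: the online network selects the action,
    # the target network evaluates it.
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    return n_step_return + (gamma ** n) * q_target_next[best_action]


def soft_update(
    target_params: Sequence[float],
    online_params: Sequence[float],
    tau: float,
) -> List[float]:
    """Soft (Polyak) update: theta_target <- tau * theta + (1 - tau) * theta_target."""
    return [tau * w + (1.0 - tau) * t for t, w in zip(target_params, online_params)]
```

In a training loop, `n_step_double_dqn_target` would supply the regression target for each sampled transition, and `soft_update` would be applied to the target network's parameters after each gradient step with a small `tau`.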

Keywords:
Jamming, Phased array, Reinforcement learning, Radar, Computer science, Reinforcement, Engineering, Artificial intelligence, Telecommunications, Physics, Structural engineering

Metrics

Cited By: 1
FWCI (Field Weighted Citation Impact): 6.93
Refs: 0
Citation Normalized Percentile: 0.93

Topics

Radar Systems and Signal Processing
Physical Sciences →  Engineering →  Aerospace Engineering
Advanced SAR Imaging Techniques
Physical Sciences →  Engineering →  Aerospace Engineering