This paper proposes a convolutional multi-agent deep deterministic policy gradient method with prioritized experience replay (PER-CMADDPG) for the problem of multi-AUV cooperative search for moving targets. A comprehensive mathematical model of the search task is first established, comprising the environment model, the AUV model, and the information update and fusion model. Building on the MADDPG framework, the proposed PER-CMADDPG method introduces two major enhancements. First, convolutional neural networks (CNNs) are integrated into both the actor and critic networks to extract spatial features from local observation maps and global states, enabling agents to better perceive the spatial structure of the environment. Second, a prioritized experience replay (PER) mechanism is incorporated to improve learning efficiency by emphasizing informative experiences during training, thereby accelerating policy convergence. Simulation experiments demonstrate that the proposed method converges faster and attains higher rewards than MADDPG. Furthermore, the influence of the cluster size, AUV speed, and sonar detection radius on search performance is analyzed. The results verify the effectiveness of the proposed PER-CMADDPG method for multi-AUV cooperative search for moving targets.
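The PER mechanism summarized above can be illustrated with a minimal proportional-prioritization replay buffer. This is an illustrative sketch only, not the paper's implementation: all names and hyperparameters (`alpha`, `eps`, the list-based storage) are assumptions, and a production version would typically use a sum-tree for efficient sampling and add importance-sampling weights.

```python
import random

class PrioritizedReplayBuffer:
    """Minimal proportional prioritized experience replay (illustrative sketch;
    names and parameters are assumptions, not the paper's implementation)."""

    def __init__(self, capacity, alpha=0.6, eps=1e-6):
        self.capacity = capacity
        self.alpha = alpha    # how strongly priorities bias sampling (0 = uniform)
        self.eps = eps        # keeps every transition's priority strictly positive
        self.buffer = []
        self.priorities = []
        self.pos = 0          # next write position once the buffer is full

    def add(self, transition):
        # New transitions receive the current maximum priority,
        # so each is replayed at least once before being down-weighted.
        p = max(self.priorities, default=1.0)
        if len(self.buffer) < self.capacity:
            self.buffer.append(transition)
            self.priorities.append(p)
        else:
            self.buffer[self.pos] = transition
            self.priorities[self.pos] = p
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size):
        # Sampling probability is proportional to priority ** alpha.
        scaled = [p ** self.alpha for p in self.priorities]
        total = sum(scaled)
        probs = [s / total for s in scaled]
        idxs = random.choices(range(len(self.buffer)), weights=probs, k=batch_size)
        return idxs, [self.buffer[i] for i in idxs]

    def update_priorities(self, idxs, td_errors):
        # Larger TD error -> more "informative" transition -> higher replay priority.
        for i, err in zip(idxs, td_errors):
            self.priorities[i] = abs(err) + self.eps
```

In a PER-CMADDPG-style training loop, each critic update would recompute TD errors for the sampled batch and feed them back via `update_priorities`, biasing subsequent sampling toward transitions the critics currently predict poorly.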
Yang Sun, Zhenning Wu, Qiyuan Zhang, Zongying Shi, Yisheng Zhong