Shiming Quan, Su Cao, Chang Wang, Huangchao Yu
This paper addresses continuous-action-space methods for autonomous maneuvering decision-making in one-on-one (1v1) unmanned aerial vehicle (UAV) scenarios. First, a UAV kinematic model and a decision-making framework are established under the Markov Decision Process (MDP) formulation. Second, a continuous control strategy based on the Soft Actor-Critic (SAC) reinforcement learning algorithm is developed to generate precise maneuvering commands. Third, a multi-dimensional situation-coupled reward function is designed, introducing a Health Point (HP) metric to quantitatively assess situational advantage and simulate cumulative effects. Finally, extensive simulations in a custom Gym environment validate the effectiveness of the proposed method and its robustness under both ideal and noisy observation conditions.
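To illustrate how an HP-style metric can turn a momentary situational advantage into a cumulative effect, here is a minimal sketch in Python. The abstract does not give the reward formula, so everything below is an assumption made for illustration: the 2D state, the names `UavState`, `advantage`, and `hp_step`, the linear angle and range weighting, and the values of `max_range`, `max_angle`, and `damage_rate`. It is not the authors' reward function.

```python
import math
from dataclasses import dataclass


@dataclass
class UavState:
    """Hypothetical planar UAV state; the paper's kinematic model may differ."""
    x: float
    y: float
    heading: float  # radians, measured from the +x axis
    hp: float = 100.0


def advantage(attacker, target, max_range=1000.0, max_angle=math.radians(30)):
    """Assumed advantage score in [0, 1].

    The score is highest when the target is dead ahead of the attacker
    and close, and zero outside the assumed range or angle limits.
    """
    dx, dy = target.x - attacker.x, target.y - attacker.y
    dist = math.hypot(dx, dy)
    if dist > max_range:
        return 0.0
    bearing = math.atan2(dy, dx)
    # Wrap the off-boresight angle into [-pi, pi).
    off = (bearing - attacker.heading + math.pi) % (2 * math.pi) - math.pi
    if abs(off) > max_angle:
        return 0.0
    return (1 - abs(off) / max_angle) * (1 - dist / max_range)


def hp_step(own, opp, damage_rate=5.0):
    """Apply one step of HP loss to both sides and return own reward.

    Each side loses HP in proportion to the other's advantage, so a
    sustained advantage accumulates over time. The reward is the
    difference in damage rates for this step (an assumed form).
    """
    own_adv = advantage(own, opp)
    opp_adv = advantage(opp, own)
    opp.hp = max(0.0, opp.hp - damage_rate * own_adv)
    own.hp = max(0.0, own.hp - damage_rate * opp_adv)
    return damage_rate * (own_adv - opp_adv)
```

In this sketch, a pursuer directly behind an opponent at half the assumed maximum range gains a positive reward each step while the opponent's HP drains; an episode could end when either HP reaches zero.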
Bo Li, Shuangxia Bai, Shiyang Liang, Rui Ma, Evgeny Neretin, Jingyi Huang
Jun Guo, Xuefeng Zhu, Qingrong Zeng
Junyi Mao, Huawei Liang, Zhiyuan Li, Jian Wang, Pengfei Zhou