Mengmeng Shao, Junjie Yan, Xiaohui Zhao
In recent years, unmanned aerial vehicles (UAVs) have been widely used in wireless communications because of their low cost, small size, flexible deployment, and controllable mobility. However, because UAV links are predominantly Line-of-Sight (LoS), security remains a challenging problem. In particular, information theft and leakage may occur in the presence of eavesdroppers. This article proposes a UAV-enabled system with a relay UAV and a jammer UAV, serving mobile source and destination nodes in the presence of an eavesdropper, and uses it to solve the secrecy rate maximization problem. In this system, the relay UAV forwards information between pairs of moving source nodes and moving destination nodes whose direct channels are interrupted by blockage or long distance, and the jammer UAV sends jamming signals to interfere with the eavesdropper and reduce the leaked information. We formulate an average secrecy rate maximization problem for this system, with joint trajectory and transmit power optimization under certain constraints. Since this problem is nonconvex, we reformulate it as a Markov decision process (MDP) and solve it with a deep reinforcement learning (DRL) method. Specifically, we adopt a proximal policy optimization (PPO) algorithm to find an optimal solution because it can handle continuous action spaces. Given the states, rewards, and actions defined in this MDP, the algorithm autonomously learns to optimize the trajectory and power allocation of the UAVs to maximize the average secrecy rate. Simulation results demonstrate that the proposed PPO-based average secrecy rate maximization algorithm is valid, effective, and scalable.
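To make the optimization target concrete, the following is a minimal sketch of how a per-slot secrecy rate and its time average are commonly computed. It uses the standard textbook definition (legitimate rate minus eavesdropper rate, floored at zero) and a single-hop simplification; the abstract does not give the article's channel model or relay protocol, so the function names, the SINR expression, and the assumption that jamming affects only the eavesdropper are illustrative assumptions, not the authors' formulation.

```python
import math


def eavesdropper_sinr(signal_power: float, jamming_power: float, noise_power: float) -> float:
    """SINR at the eavesdropper (hypothetical model: jamming adds to the noise floor)."""
    return signal_power / (jamming_power + noise_power)


def secrecy_rate(snr_destination: float, sinr_eavesdropper: float) -> float:
    """Secrecy rate in bit/s/Hz: destination rate minus eavesdropper rate, floored at 0."""
    rate_destination = math.log2(1.0 + snr_destination)
    rate_eavesdropper = math.log2(1.0 + sinr_eavesdropper)
    return max(rate_destination - rate_eavesdropper, 0.0)


def average_secrecy_rate(per_slot_rates: list[float]) -> float:
    """Time average of per-slot secrecy rates over a flight of len(per_slot_rates) slots."""
    return sum(per_slot_rates) / len(per_slot_rates)


if __name__ == "__main__":
    # Received powers are assumed values, chosen only for illustration.
    no_jam = secrecy_rate(3.0, eavesdropper_sinr(2.0, 0.0, 1.0))
    with_jam = secrecy_rate(3.0, eavesdropper_sinr(2.0, 1.0, 1.0))
    print(no_jam, with_jam, average_secrecy_rate([no_jam, with_jam]))
```

In a DRL setting of the kind the abstract describes, a quantity like this per-slot secrecy rate would typically serve as the basis for the reward, with the UAV positions and transmit powers determining the received powers that feed into it.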
Zhen Xue, Qihui Wu, Zhiyong Feng, Caijun Zhong, Guoru Ding
Shuangyu Luo, Jiangyuan Li, Athina P. Petropulu
Christantus O. Nnamani, Muhammad R. A. Khandaker, Mathini Sellathurai
Xin Liu, Yingfeng Yu, Bao Peng, Xiangping Zhai, Qiuming Zhu, Victor C. M. Leung