Unmanned aerial vehicle (UAV) swarm communications based on reinforcement learning must address the challenges posed by large-scale dynamic networks and by strong jamming and interference. In this paper, we propose a multiagent reinforcement learning based anti-jamming communication scheme for UAV swarms that optimizes the UAV relay selection and power allocation based on the network topology, channel states, previous performance, and the network states shared by neighboring UAVs. The scheme formulates a policy distribution to improve policy space exploration and designs a soft learning mechanism to guide the policy update and stabilize the learning process. Using transfer learning, it exploits the shared swarm experiences to accelerate initial policy learning. We analyze the computational complexity of the proposed scheme and derive the performance bound in terms of the message bit error rate, the swarm energy consumption, and the utility. Simulation results show that the proposed scheme improves the swarm communication performance and reduces energy consumption compared with the benchmark scheme.
Zefang Lv, Liang Xiao, Yousong Du, Guohang Niu, Chengwen Xing, Wenyuan Xu
Zhiping Lin, Xiaohao Yan, Liang Xiao, Shi Yan, Yuliang Tang, Jun Liu
Xiaozhen Lu, Liang Xiao, Canhuang Dai, Huaiyu Dai