Unmanned aerial vehicles (UAVs) are widely used as aerial base stations (BSs) to provide wireless communication services. In this paper, we consider the joint design of the UAV's trajectory and power allocation to maximize the downlink communication rate in a UAV-enabled network deployed in disaster areas or at the cell edge. Non-orthogonal multiple access (NOMA) is used to improve the spectrum efficiency of the entire network, while all users roam randomly. The formulated problem is non-convex and the considered environment is dynamic, which makes the problem difficult to solve with conventional optimization methods. Therefore, we propose a soft actor-critic (SAC) learning scheme to tackle it. Simulation results show that the proposed learning framework is more stable and converges faster than the baseline approaches.
Xincheng Yang, Danyang Qin, Jiping Liu, Yue Li, Yong Zhu, Lin Ma
Ruikang Zhong, Yuanwei Liu, Xidong Mu, Yue Chen, Lingyang Song
Haowen Sun, Ming Chen, Yijin Pan, Yihan Cang, Jiahui Zhao, Yuanzhi Sun