JOURNAL ARTICLE

Deep Reinforcement Learning for Trajectory Design and Power Allocation in UAV Networks

Abstract

Unmanned aerial vehicles (UAVs) are considered a key component of next-generation cellular networks. Because the joint trajectory design and power allocation problem is non-convex, the optimal joint strategy is difficult to obtain in UAV-assisted cellular networks. In this paper, a reinforcement learning-based approach is proposed to maximize the long-term network utility while meeting the quality-of-service requirements of the user equipment. A Markov decision process (MDP) is formulated by designing the state, action space, and reward function. To obtain the joint optimal policy for trajectory design and power allocation, a deep reinforcement learning approach is investigated. Because the action space of the MDP model is continuous, a deep deterministic policy gradient approach is presented. Simulation results show that the proposed algorithm outperforms other approaches in overall network utility, with higher system capacity and faster processing speed.

Keywords:
Reinforcement learning; Markov decision process; Computer science; Trajectory; Q-learning; Mathematical optimization; State space; Markov process; Component (thermodynamics); Function (biology); Artificial intelligence; Mathematics

Metrics

Cited By: 27
FWCI (Field Weighted Citation Impact): 5.38
Refs: 21
Citation Normalized Percentile: 0.96

Topics

UAV Applications and Optimization
Physical Sciences → Engineering → Aerospace Engineering
Distributed Control Multi-Agent Systems
Physical Sciences → Computer Science → Computer Networks and Communications
Smart Parking Systems Research
Physical Sciences → Engineering → Building and Construction