Chong Zheng, Yongming Huang, Cheng Zhang, Tony Q. S. Quek
Mobile edge computing (MEC) combined with radio access network (RAN) slicing shows great potential for satisfying the diverse service level agreement (SLA) demands of future wireless communications. Because both the computing capacity of MEC servers and the transmission capacity of wireless links are limited, efficient hybrid resource allocation (RA) across computing and transmission resources is crucial to maintaining a high SLA satisfaction rate (SSR). However, in cooperative multi-node MEC-assisted RAN slicing systems, the complexity of multi-node cooperation in the spatial dimension and the contextual correlation of the system state in the time dimension pose significant challenges to optimizing the hybrid RA policy. In this paper, we aim to maximize the SSR for heterogeneous service demands in a cooperative MEC-assisted RAN slicing system by jointly considering multi-node computing resource cooperation and allocation, transmission resource block (RB) allocation, and the time-varying dynamics of the system. To this end, we abstract the system into a weighted undirected topology graph and then propose a recurrent graph reinforcement learning (RGRL) algorithm to learn the optimal hybrid RA policy. Therein, a graph convolutional network (GCN) and the deep deterministic policy gradient (DDPG) are combined to effectively extract spatial features from the equivalent topology graph. Furthermore, a novel time-recurrent reinforcement learning framework is designed in the proposed RGRL algorithm: the action output of the policy network at the previous moment is incorporated into the state input of the policy network at the subsequent moment, so as to cope with the time-varying and contextual network environment. In addition, we explore two use-case scenarios to examine the general superiority of the proposed RGRL algorithm. Simulation results demonstrate the superiority of the proposed algorithm in terms of average SSR, performance stability, and network complexity.
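The two structural ideas in the abstract, graph-based feature extraction over a weighted undirected topology and feeding the previous action back into the next state, can be illustrated with a minimal sketch. This is not the authors' RGRL implementation: it omits the DDPG critic, training, and any SLA model, and every name in it (`normalize_adjacency`, `gcn_layer`, `policy_step`, the 3-node topology, the random kernel, the softmax output) is a hypothetical choice made only for illustration.

```python
import math
import random


def normalize_adjacency(weights):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2} of a weighted undirected graph."""
    n = len(weights)
    a_hat = [[weights[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def gcn_layer(a_norm, features, kernel):
    """One graph-convolution layer: ReLU(A_norm @ X @ W)."""
    hidden = matmul(matmul(a_norm, features), kernel)
    return [[max(0.0, v) for v in row] for row in hidden]


def policy_step(a_norm, node_obs, prev_action, kernel):
    """Toy actor (assumption): append each node's previous action to its
    observation, apply one GCN layer, and map node scores to allocation
    shares with a softmax."""
    state = [obs + [act] for obs, act in zip(node_obs, prev_action)]
    scores = [sum(row) for row in gcn_layer(a_norm, state, kernel)]
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical 3-node MEC topology; edge weights stand in for link capacities.
link_weights = [
    [0.0, 1.0, 0.5],
    [1.0, 0.0, 2.0],
    [0.5, 2.0, 0.0],
]
a_norm = normalize_adjacency(link_weights)

rng = random.Random(0)
# Untrained kernel: 3 input features (2 observations + previous action), 4 hidden units.
kernel = [[rng.uniform(-1.0, 1.0) for _ in range(4)] for _ in range(3)]

action = [1.0 / 3.0] * 3  # initial allocation shares before any decision
for _ in range(5):
    # Placeholder observations, e.g. normalized queue length and channel quality.
    node_obs = [[rng.random(), rng.random()] for _ in range(3)]
    action = policy_step(a_norm, node_obs, action, kernel)
```

Each call to `policy_step` receives the action produced by the previous call, which is the recurrence the abstract describes. In a full actor-critic setup the kernel would be learned rather than drawn at random.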
Yong Zhang, Hao Zhang, Yan Li, Yong Zhang, Siyu Yuan
Zahraa Zakariya Saleh, Maysam Abbod, R. Nilavalan
Qiang Liu, Tao Han, Ning Zhang, Ye Wang
Demeke Shumeye Lakew, Anh-Tien Tran, Nhu‐Ngoc Dao, Sungrae Cho