JOURNAL ARTICLE

Dynamic Beam Pattern and Bandwidth Allocation Based on Multi-Agent Deep Reinforcement Learning for Beam Hopping Satellite Systems

Zhiyuan Lin, Zuyao Ni, Linling Kuang, Chunxiao Jiang, Zhen Huang

Year: 2022   Journal: IEEE Transactions on Vehicular Technology   Vol: 71 (4)   Pages: 3917-3930   Publisher: Institute of Electrical and Electronics Engineers

Abstract

Because ground traffic requests are non-uniformly distributed geographically and vary over time, making full use of limited beam resources to serve users flexibly and efficiently is a new challenge for beam hopping satellite systems. Conventional greedy beam hopping methods do not consider the long-term reward, which makes it difficult for them to cope with time-varying traffic demand. Meanwhile, heuristic algorithms such as the genetic algorithm converge slowly and cannot achieve real-time scheduling. Furthermore, existing methods based on deep reinforcement learning (DRL) make decisions only on beam patterns and lack the bandwidth degree of freedom. This paper proposes a dynamic beam pattern and bandwidth allocation scheme based on DRL that flexibly uses three degrees of freedom: time, space, and frequency. Because jointly allocating bandwidth and beam pattern leads to an explosion of the action space, a cooperative multi-agent deep reinforcement learning (MADRL) framework is presented, in which each agent is responsible only for the illumination allocation or the bandwidth allocation of one beam. The agents learn to collaborate by sharing the same reward to achieve a common goal: maximizing throughput and minimizing the delay fairness between cells. Simulation results demonstrate that the offline-trained MADRL model can achieve real-time beam pattern and bandwidth allocation that matches the non-uniform and time-varying traffic request. Furthermore, the model generalizes well when the traffic demand increases.
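
The two ideas in the abstract that lend themselves to a small illustration are the action-space explosion and the shared reward. The Python sketch below is not the paper's implementation; it only shows, under stated assumptions, why per-beam agents keep each decision small and how one team reward can be handed to every agent. The names `joint_action_count`, `per_agent_action_sizes`, `shared_reward`, the weights, the example sizes (37 cells, 4 beams, 3 bandwidth options), and the use of the max-min delay gap as a stand-in for the delay fairness term are all hypothetical choices made for illustration.

```python
import math
from typing import List


def joint_action_count(n_cells: int, n_beams: int, n_bw_options: int) -> int:
    """Size of a single joint action space (assumed model): pick which
    n_beams cells to illuminate, then one bandwidth option per beam."""
    return math.comb(n_cells, n_beams) * n_bw_options ** n_beams


def per_agent_action_sizes(n_cells: int, n_beams: int, n_bw_options: int) -> List[int]:
    """Action-space sizes when each agent handles one decision for one beam
    (assumed split: one illumination agent and one bandwidth agent per beam)."""
    illumination_agents = [n_cells] * n_beams
    bandwidth_agents = [n_bw_options] * n_beams
    return illumination_agents + bandwidth_agents


def shared_reward(served_traffic: List[float], cell_delays: List[float],
                  w_throughput: float = 1.0, w_fairness: float = 1.0) -> float:
    """Hypothetical team reward: total served traffic minus a penalty on the
    gap between the largest and smallest per-cell delay (an assumed proxy
    for the delay fairness term; the paper's exact definition may differ)."""
    throughput = sum(served_traffic)
    delay_gap = max(cell_delays) - min(cell_delays)
    return w_throughput * throughput - w_fairness * delay_gap


def broadcast_reward(reward: float, n_agents: int) -> List[float]:
    """Every cooperating agent receives the same scalar reward."""
    return [reward] * n_agents


if __name__ == "__main__":
    # Hypothetical sizes, chosen only to make the contrast visible.
    n_cells, n_beams, n_bw = 37, 4, 3
    print("joint actions:", joint_action_count(n_cells, n_beams, n_bw))
    sizes = per_agent_action_sizes(n_cells, n_beams, n_bw)
    print("largest per-agent action space:", max(sizes))

    r = shared_reward([1.0, 2.0], [3.0, 5.0])
    print("rewards sent to agents:", broadcast_reward(r, len(sizes)))
```

In this toy setting the joint count grows combinatorially with the number of beams, while each agent's own action space stays at the number of cells or the number of bandwidth options, which is the motivation the abstract gives for the multi-agent split.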

Keywords:
Reinforcement learning; Dynamic bandwidth allocation; Computer science; Bandwidth (computing); Bandwidth allocation; Q-learning; Channel allocation schemes; Communications satellite; Scheduling (production processes); Greedy algorithm; Beam search; Distributed computing; Real-time computing; Mathematical optimization; Artificial intelligence; Computer network; Satellite; Engineering; Telecommunications; Algorithm; Search algorithm

Metrics

Cited By: 141
FWCI (Field Weighted Citation Impact): 47.01
Refs: 42
Citation Normalized Percentile: 1.00
Is in top 1%
Is in top 10%

Topics

Satellite Communication Systems
Physical Sciences →  Engineering →  Aerospace Engineering
Age of Information Optimization
Physical Sciences →  Computer Science →  Computer Networks and Communications
Opportunistic and Delay-Tolerant Networks
Physical Sciences →  Computer Science →  Computer Networks and Communications