Jun Chen, Junyu Mi, Chen Guo, Qing Fu, Weidong Tang, Wenlang Luo, Qing Zhu
Mobile edge computing (MEC) systems empowered by energy harvesting (EH) significantly enhance sustainable computing capabilities for mobile devices (MDs). This paper investigates a multi-user multi-server MEC network in which energy-constrained users dynamically harvest ambient energy and flexibly allocate resources among local computation, task offloading, and intentional task discarding. We formulate a stochastic optimization problem that minimizes the time-averaged weighted sum of execution delay, energy consumption, and task-discard penalty. To handle the energy causality constraints and temporal coupling effects, we develop a Lyapunov optimization-based drift-plus-penalty framework that decomposes the long-term optimization into sequential per-time-slot subproblems. Furthermore, to overcome the curse of dimensionality in high-dimensional action spaces, we propose hierarchical deep reinforcement learning (DRL) solutions incorporating both Q-learning with experience replay and asynchronous advantage actor–critic (A3C) architectures. Extensive simulations demonstrate that our DRL-driven approach achieves lower costs than conventional model predictive control methods while maintaining robust performance under stochastic energy arrivals and channel variations.
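To illustrate the per-time-slot decomposition described above, the following is a minimal sketch of a drift-plus-penalty decision rule. It is not the paper's algorithm: the action costs, the virtual energy-queue weighting, and the parameter `V` (the standard Lyapunov cost–backlog trade-off knob) are all illustrative assumptions.

```python
# Hypothetical per-slot costs for the three options an energy-constrained
# user has in each slot: local computation, offloading, or discarding the
# task. The numbers are illustrative assumptions, not values from the paper.
ACTIONS = {
    "local":   {"cost": 1.0, "energy": 0.8},   # delay-dominated, energy-hungry
    "offload": {"cost": 0.6, "energy": 0.4},   # cheaper if the channel is good
    "drop":    {"cost": 2.0, "energy": 0.0},   # discard penalty, no energy use
}

def drift_plus_penalty_action(battery, queue, V=1.0):
    """Choose the action minimizing the one-slot drift-plus-penalty score
    V * cost + queue * energy, subject to energy causality: an action is
    feasible only if its energy draw fits the current battery level.
    `queue` is a virtual energy-deficit queue; a large backlog biases the
    choice toward energy-frugal actions (eventually, dropping the task)."""
    feasible = {a: p for a, p in ACTIONS.items() if p["energy"] <= battery}
    return min(feasible,
               key=lambda a: V * ACTIONS[a]["cost"] + queue * ACTIONS[a]["energy"])
```

With no backlog the rule picks the lowest-cost feasible action (offloading here); as the virtual queue grows, the energy term dominates and the rule discards the task, which is how the drift-plus-penalty framework enforces the long-term energy constraint without solving a coupled multi-slot problem.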