Machine learning is becoming an integral component of mobile applications. However, executing compute-heavy neural models (e.g., for computer vision tasks) on resource-constrained devices is challenging because of their limited computing power, memory, and energy reserves. While edge computing mitigates these issues, transferring information-rich signals over capacity-limited and time-varying wireless channels may result in high and highly variable latency. Herein, we propose a methodology to route heterogeneous tasks across the resources and layers of systems composed of mobile devices and edge servers. Unlike prior work, we consider aspects of real-world systems that are rarely captured in abstract models, such as context switching, task accumulation, and the interplay between the communication and computing components of the overall pipeline. To optimize the task flow, we use a deep reinforcement learning agent trained on real-world data collected using a system we developed. The agent uses a detailed state definition that draws features from several logical blocks of the system. Results indicate that the agent adapts the routing of tasks to the parameters controlling their heterogeneity, as well as to the hardware setup and the state of the wireless channel.
Sanjaya Kumar Panda, Indrajeet Gupta, Prasanta K. Jana
Dongyang Liu, Junhua Chen, Xueda Huang, Haojun Hong
Mohannad Nabelsee, Anselm Busse, Helge Parzyjegla, Gero Mühl