JOURNAL ARTICLE

Share-Aware Joint Model Deployment and Task Offloading for Multi-Task Inference

Yalan Wu, Jigang Wu, Long Chen, Bosheng Liu, Mianyang Yao, Siew-Kei Lam

Year: 2024 Journal: IEEE Transactions on Intelligent Transportation Systems Vol: 25 (6) Pages: 5674-5687 Publisher: Institute of Electrical and Electronics Engineers

Abstract

In vehicular edge computing, efficient strategies for model deployment and task offloading offer tremendous potential to reduce the response time of machine learning inference. However, existing works pay little attention to the structures shared among different types of inference tasks, which limits the achievable improvement in response time. This paper aims to fill this gap by investigating a share-aware joint model deployment and task offloading problem for multi-task inference in vehicular edge computing. We formulate the problem with the objective of minimizing the total response time of all inference requests, subject to constraints including per-task response time and per-roadside-unit (RSU) storage capacity. We prove that the formulated problem is NP-hard. To solve it, we propose a time-period-aware algorithm, called TPA, with a guaranteed approximation ratio. In TPA, an iterative approach is designed to solve the problem of maximizing system throughput during a given time period. This time period then approximates the minimum time period needed to complete all requests. The algorithms are evaluated in an environment comprising two CPUs, two GPUs, state-of-the-art multi-task learning models, and the Google cluster-usage trace dataset. Simulation results from this environment show that the proposed TPA outperforms state-of-the-art methods in all cases in terms of the total response time of all requests. For example, compared with state-of-the-art methods, TPA reduces the total response time by at least $73.72\%$ across the different numbers of RSUs considered.

Keywords:
Inference; Computer science; Software deployment; Task (project management); Throughput; Enhanced Data Rates for GSM Evolution; Response time; Artificial intelligence; Wireless; Engineering

Metrics

Cited By: 11
FWCI (Field Weighted Citation Impact): 7.03
Refs: 46
Citation Normalized Percentile: 0.95

Topics

Privacy-Preserving Technologies in Data
Physical Sciences →  Computer Science →  Artificial Intelligence
Age of Information Optimization
Physical Sciences →  Computer Science →  Computer Networks and Communications
IoT and Edge/Fog Computing
Physical Sciences →  Computer Science →  Computer Networks and Communications
