JOURNAL ARTICLE

PipeCo: Pipelining Cold Start of Deep Learning Inference Services on Serverless Platforms

Jiaang Duan, Shiyou Qian, Hanwen Hu, Dingyu Yang, Jian Cao, Guangtao Xue

Year: 2025
Journal: Proceedings of the ACM on Measurement and Analysis of Computing Systems
Vol: 9 (2)
Pages: 1-23
Publisher: Association for Computing Machinery

Abstract

The fusion of serverless computing and deep learning (DL) has led to serverless inference, offering a promising approach for developing and deploying scalable and cost-efficient deep learning inference services (DLISs). However, the challenge of cold start presents a significant obstacle for DLISs, where DL model size greatly impacts latency. Existing studies mitigate cold starts by extending keep-alive times, which unfortunately leads to decreased resource utilization efficiency. To address this issue, we introduce PipeCo, a system designed to alleviate DLIS cold start. The core concept of PipeCo is to achieve the miniaturization and pipelining of DLIS cold start. Firstly, PipeCo utilizes a vertical partitioning approach to divide each DLIS into multiple slices, prewarming slices in a sequential and overlapping manner to decrease the overall cold-start latency. Secondly, PipeCo employs an attention-based prediction mechanism to estimate periodic patterns in requests and idle containers for scheduling slices. Thirdly, PipeCo incorporates a similarity-based container matcher for the reuse of idle containers. We implemented a prototype of PipeCo on the OpenFaaS platform and conducted extensive experiments using three real-world DLIS repositories. The results demonstrate that PipeCo effectively decreases end-to-end (E2E) latency by up to 62.67% on CPU and 58.81% on GPU clusters and reduces the overall resource usage by 65.31% compared to five state-of-the-art baselines.
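The abstract does not spell out how the overlapped prewarming is scheduled, so the following is only a minimal sketch of why pipelining slices can shorten a cold start. It assumes a simplified two-stage model in which each slice has a load (prewarm) time and an execute time, and the next slice may load while the current one executes. The function names, the two-stage model, and the timings are all hypothetical illustrations, not PipeCo's actual design or measurements.

```python
from typing import List, Tuple

# Each slice is (load_seconds, execute_seconds). Values are made up for illustration.
Slice = Tuple[float, float]


def sequential_latency(slices: List[Slice]) -> float:
    """Cold start without overlap: load every slice, then execute every slice."""
    total_load = sum(load for load, _ in slices)
    total_run = sum(run for _, run in slices)
    return total_load + total_run


def pipelined_latency(slices: List[Slice]) -> float:
    """Cold start with overlap: slice i+1 loads while slice i executes.

    Assumes one loader and one executor, i.e. a two-stage pipeline.
    """
    load_done = 0.0  # time at which the current slice has finished loading
    run_done = 0.0   # time at which the current slice has finished executing
    for load, run in slices:
        load_done += load  # loads are serialized on the single loader
        # A slice can start executing only once it is loaded and the
        # previous slice has finished executing.
        run_done = max(run_done, load_done) + run
    return run_done


if __name__ == "__main__":
    example = [(2.0, 1.0), (2.0, 1.0), (2.0, 1.0)]  # hypothetical timings
    print(sequential_latency(example))  # 9.0
    print(pipelined_latency(example))   # 7.0
```

Under these assumed timings, the overlap hides the execution time of every slice except the last behind the loading of the next one. The real gain depends on how the model is partitioned and on the actual load and execute costs, which the abstract does not report.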

Keywords:
Inference; Computer science; Deep learning; Cold start (automotive); Artificial intelligence; Computer architecture; Engineering

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 42
Citation Normalized Percentile: 0.19

Topics

Advanced Data Storage Technologies
Physical Sciences →  Computer Science →  Computer Networks and Communications
Stock Market Forecasting Methods
Social Sciences →  Decision Sciences →  Management Science and Operations Research
Traffic Prediction and Management Techniques
Physical Sciences →  Engineering →  Building and Construction