Powering Multi-Task Federated Learning with Competitive GPU Resource Sharing

Yongbo Yu; Fuxun Yu; Zirui Xu; Di Wang; Minjia Zhang; Ang Li; Shawn Bray; Chenchen Liu; Xiang Chen

doi:10.1145/3487553.3524859

JOURNAL ARTICLE

Year: 2022 Journal: Companion Proceedings of the Web Conference 2022 Pages: 567-571

Abstract

Federated learning (FL) increasingly involves compound learning tasks as cognitive applications grow more complex. For example, a self-driving system hosts multiple tasks simultaneously (e.g., detection and classification) and expects FL to maintain life-long intelligence involvement. However, our analysis demonstrates that two issues arise when compound FL models for multiple training tasks are deployed on a GPU: (1) because different tasks' skewed data distributions and corresponding models cause highly imbalanced learning workloads, current GPU scheduling methods fail to allocate resources effectively; (2) consequently, existing FL schemes, which focus only on heterogeneous data distribution and not on runtime computing, cannot practically achieve optimally synchronized federation. To address these issues, we propose a full-stack FL optimization scheme that covers both intra-device GPU scheduling and inter-device FL coordination for multi-task training. Specifically, our work illustrates two key insights in this research domain: (1) competitive resource sharing is beneficial for parallel model execution, and the proposed concept of "virtual resource" can effectively characterize and guide practical per-task resource utilization and allocation; (2) FL can be further improved by taking architectural-level coordination into consideration. Our experiments demonstrate that FL throughput can be significantly increased.
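The imbalance problem described in the abstract can be illustrated with a toy model. The sketch below is not the paper's scheduler or its "virtual resource" formulation; it assumes, purely for illustration, that each task's training time is its workload divided by a fixed fraction of the GPU, and that a synchronized round ends only when the slowest task finishes. The function names and the workload numbers are hypothetical.

```python
# Toy model (an assumption for illustration, not the paper's method):
# task time = workload / GPU share, and a synchronized round lasts as
# long as its slowest task.

def equal_shares(workloads):
    """Give every task the same static fraction of the GPU."""
    n = len(workloads)
    return [1.0 / n for _ in workloads]


def proportional_shares(workloads):
    """Give every task a GPU fraction proportional to its workload."""
    total = sum(workloads)
    return [w / total for w in workloads]


def round_time(workloads, shares):
    """Duration of one synchronized round: the slowest task sets the pace."""
    return max(w / s for w, s in zip(workloads, shares))


if __name__ == "__main__":
    # Hypothetical workloads in arbitrary units: one heavy task, two light ones.
    workloads = [8.0, 2.0, 2.0]
    print("equal shares:       ", round_time(workloads, equal_shares(workloads)))
    print("proportional shares:", round_time(workloads, proportional_shares(workloads)))
```

Under these assumptions, static equal sharing leaves the two light tasks idle while the heavy task finishes, whereas workload-aware shares let all three tasks finish together and the round time drops from about 24 to about 12 time units. Real GPU contention is not this linear, which is part of why the abstract argues for a dedicated characterization of per-task resource use.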

Keywords:

Computer science; Scheduling (production processes); Distributed computing; Shared resource; Task (project management); Implementation; Computer architecture; Human–computer interaction; Artificial intelligence; Computer network; Software engineering

Metrics

FWCI (Field Weighted Citation Impact): 0.35

Citation Normalized Percentile: 0.52

Topics

Privacy-Preserving Technologies in Data

Physical Sciences → Computer Science → Artificial Intelligence

Stochastic Gradient Optimization Techniques

Physical Sciences → Computer Science → Artificial Intelligence

Cryptography and Data Security

Physical Sciences → Computer Science → Artificial Intelligence

Related Documents

FedMT: Multi-Task Federated Learning with Competitive GPU Resource Sharing

UAV-Assisted Multi-Task Federated Learning with Task Knowledge Sharing

Communication-Efficient Federated Multi-Task Learning with Sparse Sharing

Multi-Tenant Deep Learning Acceleration with Competitive GPU Resource Sharing