JOURNAL ARTICLE

HACCS: Heterogeneity-Aware Clustered Client Selection for Accelerated Federated Learning

Joel Wolfrath, Nikhil Sreekumar, Dhruv Kumar, Yuanli Wang, Abhishek Chandra

Year: 2022
Journal: 2022 IEEE International Parallel and Distributed Processing Symposium (IPDPS)
Pages: 985-995

Abstract

Federated Learning is a machine learning paradigm where a global model is trained in-situ across a large number of distributed edge devices. While this technique avoids the cost of transferring data to a central location and achieves a strong degree of privacy, it presents additional challenges due to the heterogeneous hardware resources available for training. Furthermore, data is not independent and identically distributed (IID) across all edge devices, resulting in statistical heterogeneity across devices. Due to these constraints, client selection strategies play an important role in timely convergence during model training. Existing strategies ensure that each individual device is included, at least periodically, in the training process. In this work, we propose HACCS, a Heterogeneity-Aware Clustered Client Selection system that identifies and exploits the statistical heterogeneity by representing all distinguishable data distributions, instead of individual devices, in the training process. HACCS is robust to individual device dropout, provided other devices in the system have similar data distributions. We propose privacy-preserving methods for estimating these client distributions and clustering them. We also propose strategies for leveraging these clusters to make scheduling decisions in a federated learning system. Our evaluation on real-world datasets suggests that our framework can provide an 18% to 38% reduction in time to convergence compared to the state of the art without any compromise in accuracy.
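The idea of selecting per data distribution rather than per device can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the method described in the paper: the `Client` fields, the use of a normalized label histogram as the distribution summary, the L1 distance with a fixed `threshold`, the greedy clustering, and the "fastest available device per cluster" rule are all hypothetical. The sketch also omits the privacy-preserving estimation step that the abstract mentions.

```python
from dataclasses import dataclass


@dataclass
class Client:
    name: str
    label_hist: list[float]  # assumed summary: normalized label histogram
    latency: float           # assumed estimate of seconds per training round
    available: bool = True


def l1_distance(p, q):
    """L1 distance between two histograms of equal length."""
    return sum(abs(a - b) for a, b in zip(p, q))


def cluster_clients(clients, threshold=0.5):
    """Greedy clustering: a client joins the first cluster whose first
    member has a histogram within `threshold`, else starts a new cluster."""
    clusters = []
    for client in clients:
        for cluster in clusters:
            if l1_distance(client.label_hist, cluster[0].label_hist) <= threshold:
                cluster.append(client)
                break
        else:
            clusters.append([client])
    return clusters


def select_clients(clusters):
    """Pick the lowest-latency available client from each cluster, so that
    every distinguishable distribution is represented when possible."""
    selected = []
    for cluster in clusters:
        candidates = [c for c in cluster if c.available]
        if candidates:
            selected.append(min(candidates, key=lambda c: c.latency))
    return selected


clients = [
    Client("a", [0.9, 0.1, 0.0], latency=4.0),
    Client("b", [0.8, 0.2, 0.0], latency=1.5),
    Client("c", [0.0, 0.1, 0.9], latency=3.0),
    Client("d", [0.1, 0.0, 0.9], latency=2.0),
]
clusters = cluster_clients(clients)
print([c.name for c in select_clients(clusters)])  # ['b', 'd']

# Device "b" drops out; its cluster is still represented by "a".
clients[1].available = False
print([c.name for c in select_clients(clusters)])  # ['a', 'd']
```

The second call shows the dropout behavior the abstract refers to: losing one device does not remove its data distribution from the round as long as another device in the same cluster is available.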

Keywords:
Computer science; Cluster analysis; Independent and identically distributed random variables; Exploit; Machine learning; Edge device; Scheduling (production processes); Enhanced Data Rates for GSM Evolution; Process (computing); Artificial intelligence; Data mining; Information privacy; Distributed computing; Cloud computing

Metrics

Cited By: 56
FWCI (Field Weighted Citation Impact): 6.58
Refs: 58
Citation Normalized Percentile: 0.97

Topics

Privacy-Preserving Technologies in Data
Physical Sciences → Computer Science → Artificial Intelligence
Stochastic Gradient Optimization Techniques
Physical Sciences → Computer Science → Artificial Intelligence
Mobile Crowdsensing and Crowdsourcing
Physical Sciences → Computer Science → Computer Science Applications