JOURNAL ARTICLE

Can hierarchical client clustering mitigate the data heterogeneity effect in federated learning?

Abstract

Federated learning (FL) was proposed for training deep neural network models on data from millions of users. The technique has attracted considerable attention owing to its privacy-preserving nature. However, two major challenges remain. The first is the limit on the number of clients that can participate simultaneously: as the number of clients grows, the single parameter server easily becomes a bottleneck and training becomes prone to stragglers. The second is data heterogeneity, which degrades the accuracy of the global model. Because data must remain on user devices to preserve privacy, we cannot use data shuffling, which traditional distributed deep learning relies on to homogenize training data. We propose a client clustering and model aggregation method, CCFed, to increase the number of simultaneously participating clients and mitigate the data heterogeneity problem. CCFed improves learning performance by using set partition modeling to distribute data evenly across clusters, which mitigates the effect of a non-IID environment. Experiments on benchmark datasets show that CCFed achieves 2.7-14% higher accuracy than FedAvg while requiring approximately 50% fewer training rounds.
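
The abstract does not spell out CCFed's set partition model or its aggregation rule, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes, hypothetically, that the server can see a per-client label histogram, and it uses a simple greedy heuristic to place clients into clusters whose pooled label counts are as even as possible. The function names `imbalance`, `greedy_balanced_clusters`, and `weighted_average` are illustrative only. The last function is the sample-count-weighted averaging used by FedAvg, which could be applied first within each cluster and then across clusters.

```python
from typing import List, Sequence


def imbalance(totals: Sequence[Sequence[int]]) -> int:
    """Sum over classes of (largest - smallest) per-cluster sample count."""
    num_classes = len(totals[0])
    return sum(
        max(t[c] for t in totals) - min(t[c] for t in totals)
        for c in range(num_classes)
    )


def greedy_balanced_clusters(
    label_counts: Sequence[Sequence[int]], num_clusters: int
) -> List[List[int]]:
    """Greedy heuristic (an assumption, not CCFed itself): place each client
    in the cluster that keeps the pooled label histograms most even."""
    num_classes = len(label_counts[0])
    totals = [[0] * num_classes for _ in range(num_clusters)]
    clusters: List[List[int]] = [[] for _ in range(num_clusters)]
    # Visit clients with more samples first; ties keep their original order.
    order = sorted(range(len(label_counts)), key=lambda i: -sum(label_counts[i]))
    for client in order:
        best_k, best_cost = 0, None
        for k in range(num_clusters):
            # Tentatively add this client's histogram to cluster k.
            trial = [list(t) for t in totals]
            for c in range(num_classes):
                trial[k][c] += label_counts[client][c]
            cost = imbalance(trial)
            if best_cost is None or cost < best_cost:
                best_k, best_cost = k, cost
        clusters[best_k].append(client)
        for c in range(num_classes):
            totals[best_k][c] += label_counts[client][c]
    return clusters


def weighted_average(
    vectors: Sequence[Sequence[float]], weights: Sequence[float]
) -> List[float]:
    """FedAvg-style average of parameter vectors, weighted by sample count."""
    total = sum(weights)
    dim = len(vectors[0])
    return [sum(w * v[j] for v, w in zip(vectors, weights)) / total for j in range(dim)]


# Four clients, each holding only one of two classes (a strongly non-IID case).
histograms = [[10, 0], [0, 10], [10, 0], [0, 10]]
print(greedy_balanced_clusters(histograms, num_clusters=2))
print(weighted_average([[1.0, 2.0], [3.0, 4.0]], weights=[1, 3]))
```

In this toy case each resulting cluster pools one client of each class, so both clusters see a balanced label distribution even though no individual client does. Sharing label histograms with the server is itself a privacy trade-off that a real system would have to justify.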

Keywords:
Computer science, Bottleneck, Shuffling, Cluster analysis, Partition (number theory), Benchmark (surveying), Federated learning, Data mining, Training set, Machine learning, Deep learning, Data modeling, Data set, Information privacy, Set (abstract data type), Artificial neural network, Artificial intelligence, Database

Metrics

Cited By: 2
FWCI (Field Weighted Citation Impact): 0.51
Refs: 36
Citation Normalized Percentile: 0.65

Topics

Privacy-Preserving Technologies in Data
Physical Sciences → Computer Science → Artificial Intelligence
Traffic Prediction and Management Techniques
Physical Sciences → Engineering → Building and Construction
Vehicular Ad Hoc Networks (VANETs)
Physical Sciences → Engineering → Electrical and Electronic Engineering

Related Documents

JOURNAL ARTICLE

Robust and Scalable Federated Learning Framework for Client Data Heterogeneity Based on Optimal Clustering

Zihan Li, Shuai Yuan, Zhitao Guan

Journal: Journal of Parallel and Distributed Computing, Year: 2024, Vol: 195, Pages: 104990

JOURNAL ARTICLE

Client Selection in Hierarchical Federated Learning

Silvana Trindade, Nelson L. S. da Fonseca

Journal: IEEE Internet of Things Journal, Year: 2024, Vol: 11 (17), Pages: 28480-28495

JOURNAL ARTICLE

Federated Learning Client Selection Mechanism Under System and Data Heterogeneity

Fa Xin, Jinghui Zhang, Junzhou Luo, Fang Dong

Journal: 2022 IEEE 25th International Conference on Computer Supported Cooperative Work in Design (CSCWD), Year: 2022, Pages: 1239-1244