JOURNAL ARTICLE

Federated Generation of Synthetic Tabular Data

Abstract

Machine learning (ML) models have been demonstrated to be beneficial in various domains. However, their application remains severely limited due to concerns about (1) using personal data for training ML models and (2) exchanging data between different organizations, such as hospitals and banks. Both cases might lead to privacy breaches and disclosure of sensitive information. In this work, we tackle both problems simultaneously by generating synthetic data in a federated learning manner. Previous work in this field primarily addresses image data generation, while we focus on tabular data, which is more relevant for sensitive data domains. In particular, we propose adapting two centralized tabular data generation methods, Bayesian Networks and Variational Autoencoders, to the federated setting, with a novel aggregation approach applied specifically to Bayesian Networks. We perform an exhaustive evaluation of the generated synthetic data on three datasets in terms of fidelity, utility, and privacy. Further, we demonstrate how the quality of the synthetic data changes depending on how the data are partitioned among the clients participating in federated learning and how the number of clients impacts the results. Our results suggest that, in many cases, the proposed methods in federated settings perform similarly to those in centralized settings and outperform local data generation. However, data imbalance among clients significantly affects the synthetic data generated by Variational Autoencoders.
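
The abstract does not spell out how client models are combined. As a rough illustration of what aggregation in federated learning commonly looks like, the sketch below implements standard federated averaging (FedAvg), where each client's parameters are weighted by its local sample count. This is an assumption made for illustration only: it is not the novel aggregation approach proposed for Bayesian Networks, and the function name `fedavg`, the parameter names, and the client sizes are all hypothetical.

```python
from typing import Dict, List

# Model parameters as named flat lists of floats (a simplification for illustration).
Params = Dict[str, List[float]]


def fedavg(client_params: List[Params], client_sizes: List[int]) -> Params:
    """Average client parameters, weighting each client by its local sample count.

    Standard FedAvg-style aggregation; NOT the paper's Bayesian Network method.
    """
    if not client_params or len(client_params) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    averaged: Params = {}
    for name in client_params[0]:
        length = len(client_params[0][name])
        averaged[name] = [
            sum(p[name][i] * n for p, n in zip(client_params, client_sizes)) / total
            for i in range(length)
        ]
    return averaged


# Hypothetical example: two clients with imbalanced data (100 vs. 300 rows).
clients = [
    {"encoder.w": [1.0, 2.0], "decoder.b": [0.0]},
    {"encoder.w": [3.0, 6.0], "decoder.b": [4.0]},
]
global_params = fedavg(clients, [100, 300])
print(global_params)
```

Because the weights are proportional to client size, the larger client pulls the aggregate toward its own parameters, which gives one intuition for why imbalance among clients can matter for a model trained this way.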

Keywords:
Federated learning, Partition (number theory), Synthetic data, Field (mathematics), Focus (optics), Training set, Information privacy, Bayesian probability

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 0
Citation Normalized Percentile: 0.36

Topics

Privacy-Preserving Technologies in Data
Physical Sciences →  Computer Science →  Artificial Intelligence
Adversarial Robustness in Machine Learning
Physical Sciences →  Computer Science →  Artificial Intelligence
Explainable Artificial Intelligence (XAI)
Physical Sciences →  Computer Science →  Artificial Intelligence

Related Documents

DISSERTATION

Generation and Evaluation of Realistic Tabular Synthetic Data

Lautrup, Anton Danholt

University: University of Southern Denmark Research Portal (University of Southern Denmark), Year: 2025

JOURNAL ARTICLE

Differentially Private Normalizing Flows for Synthetic Tabular Data Generation

Jae Wook Lee, M. Kim, Yonghyun Jeong, Youngmin Ro

Journal: Proceedings of the AAAI Conference on Artificial Intelligence, Year: 2022, Vol: 36 (7), Pages: 7345-7353