Synthetic data generation has rapidly emerged as a cornerstone technology for privacy-preserving artificial intelligence (AI). In light of tightening data protection regulations and the growing ethical emphasis on safeguarding personal information, researchers have developed a range of methods for synthesizing realistic datasets without compromising individual privacy. This review synthesizes existing approaches, focusing on generative adversarial networks (GANs), variational autoencoders (VAEs), and Bayesian techniques. We systematically evaluate these models on data utility, privacy guarantees, and vulnerability to adversarial attacks. Despite significant progress, challenges such as utility-privacy trade-offs, model bias, and the lack of standard evaluation metrics persist. This review highlights these gaps and proposes strategic future directions for the research community, advocating hybrid models, interpretability-focused synthetic generation, and cross-disciplinary collaboration to achieve more trustworthy AI ecosystems.
Jill-Jênn Vie, Tomas Rigaux, Sein Minn
Pentyala, Sikha; Menzies, Shane; De Cock, Martine
Fan Liu, Zhiyong Cheng, Huilin Chen, Yinwei Wei, Liqiang Nie, Mohan Kankanhalli