Debiasing Synthetic Data Generated by Deep Generative Models

Ghent University Hospital - SYNDARA
Paloma Rabaey

Neural Information Processing Systems 

While synthetic data hold great promise for privacy protection, their statistical analysis poses significant challenges that necessitate innovative solutions. The use of deep generative models (DGMs) for synthetic data generation is known to induce considerable bias and imprecision in synthetic data analyses, compromising their inferential utility relative to analyses of the original data. This bias and uncertainty can be substantial enough to impede statistical convergence rates, even in seemingly straightforward analyses such as calculating a mean.