Privacy Preserving Diffusion Models for Mixed-Type Tabular Data Generation
Sattarov, Timur, Schreyer, Marco, Borth, Damian
arXiv.org Artificial Intelligence
We introduce DP-FinDiff, a differentially private diffusion framework for synthesizing mixed-type tabular data. DP-FinDiff employs embedding-based representations for categorical features, reducing encoding overhead and scaling to high-dimensional datasets. To adapt differentially private (DP) training to the diffusion process, we propose two privacy-aware training strategies: an adaptive timestep sampler that aligns updates with diffusion dynamics, and a feature-aggregated loss that mitigates clipping-induced bias. Together, these enhancements improve fidelity and downstream utility without weakening privacy guarantees. On financial and medical datasets, DP-FinDiff achieves 16-42% higher utility than DP baselines at comparable privacy levels, demonstrating its promise for safe and effective data sharing in sensitive domains.
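The abstract does not spell out the DP training procedure. As a rough illustration of where clipping-induced bias can come from, the following is a minimal sketch of a generic DP-SGD-style update (per-example gradient clipping followed by Gaussian noise), assuming that is the kind of DP training meant. The function names, the toy gradients, and all parameter values are hypothetical and are not taken from the paper; this is not the authors' implementation.

```python
import math
import random


def clip_gradient(grad, max_norm):
    """Scale one example's gradient so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def dp_sgd_step(params, per_example_grads, lr, max_norm, noise_multiplier, rng):
    """One generic DP-SGD-style update (illustrative, not the paper's method).

    Clips each example's gradient, sums the clipped gradients, adds Gaussian
    noise with standard deviation noise_multiplier * max_norm to each
    coordinate, averages over the batch, and takes a gradient step.
    """
    batch_size = len(per_example_grads)
    clipped = [clip_gradient(g, max_norm) for g in per_example_grads]
    summed = [sum(col) for col in zip(*clipped)]
    sigma = noise_multiplier * max_norm
    noisy_mean = [(s + rng.gauss(0.0, sigma)) / batch_size for s in summed]
    return [p - lr * g for p, g in zip(params, noisy_mean)]


# Toy per-example gradients (hypothetical values) with L2 norms 5.0 and 0.5.
rng = random.Random(0)
grads = [[3.0, 4.0], [0.3, 0.4]]
new_params = dp_sgd_step([0.0, 0.0], grads, lr=0.1, max_norm=1.0,
                         noise_multiplier=1.0, rng=rng)
print(new_params)
```

In this toy batch the first gradient is scaled down to norm 1.0 while the second passes through unchanged, so the clipped average no longer equals the true average gradient even before noise is added. That gap is one common reading of "clipping-induced bias"; how the feature-aggregated loss reduces it is not described in the abstract.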
Dec-2-2025
- Country:
- Europe > Germany > Hesse > Darmstadt Region > Frankfurt (0.04)
- Europe > Switzerland > Bern > Bern (0.04)
- Europe > Switzerland > St. Gallen > St. Gallen (0.04)
- Genre:
- Research Report > New Finding (0.68)
- Industry:
- Health & Medicine (0.94)
- Information Technology > Security & Privacy (1.00)
- Technology: