The Elusive Pursuit of Replicating PATE-GAN: Benchmarking, Auditing, Debugging
Ganev, Georgi, Annamalai, Meenatchi Sundaram Muthu Selva, De Cristofaro, Emiliano
–arXiv.org Artificial Intelligence
Privacy-preserving synthetic data has been increasingly adopted to share data within and across organizations while reducing privacy risks. The intuition is to train a generative model on the real data, draw samples from the model, and create new (synthetic) data points. Since the original data may contain sensitive and/or personal information, the synthetic data can be vulnerable to membership/property inference attacks, reconstruction attacks, etc. [6, 13, 25, 29, 30, 57]. Thus, models should be trained to satisfy robust privacy definitions like Differential Privacy (DP) [19, 20], which bounds the privacy leakage from the synthetic data. Combining generative models with DP has been advocated for or deployed by government agencies [2, 31, 46, 62], regulatory bodies [60, 61], and non-profit organizations [48, 63].
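The pipeline sketched in the abstract — fit a generative model to real data under a DP guarantee, then release only samples from the model — can be illustrated with a deliberately minimal toy example. This is not PATE-GAN; it is a 1-D Gaussian model whose sufficient statistics are privatized with the Laplace mechanism, assuming records are clipped to [0, 1]. All names and the budget split are illustrative assumptions.

```python
import random
import statistics

def laplace_noise(rng, scale):
    # The difference of two iid Exponential(1/scale) draws is
    # Laplace(0, scale)-distributed.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def dp_gaussian_synth(real, epsilon=1.0, n_synth=100, seed=0):
    """Toy sketch of the paper's framing (NOT PATE-GAN): fit a simple
    generative model -- a 1-D Gaussian -- to real records assumed
    clipped to [0, 1], privatize its sufficient statistics with the
    Laplace mechanism, then sample synthetic data from the noisy
    model only."""
    rng = random.Random(seed)
    n = len(real)
    # On [0, 1]-bounded data the mean has sensitivity 1/n; the spread
    # is treated the same way here purely for illustration. The
    # budget epsilon is split evenly between the two statistics.
    scale = (1.0 / n) / (epsilon / 2.0)
    noisy_mean = statistics.fmean(real) + laplace_noise(rng, scale)
    noisy_std = max(1e-6, statistics.pstdev(real) + laplace_noise(rng, scale))
    # Synthetic records come only from the privatized model,
    # re-clipped to the assumed [0, 1] domain.
    return [min(1.0, max(0.0, rng.gauss(noisy_mean, noisy_std)))
            for _ in range(n_synth)]

# Example: 1,000 clipped records, modest privacy budget.
_rng = random.Random(1)
real = [min(1.0, max(0.0, _rng.gauss(0.5, 0.1))) for _ in range(1000)]
synth = dp_gaussian_synth(real, epsilon=1.0, n_synth=50)
```

The point of the sketch is the release boundary: the real records influence only the two noisy statistics, so any downstream use of `synth` inherits the DP guarantee of those statistics.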
Jun-20-2024
- Country:
  - Asia > Middle East > Israel (0.04)
- Genre:
  - Research Report (0.83)
- Industry:
  - Health & Medicine (0.97)
  - Information Technology > Security & Privacy (0.88)
- Technology:
  - Information Technology
  - Artificial Intelligence > Machine Learning
    - Ensemble Learning (0.68)
    - Neural Networks (1.00)
  - Performance Analysis > Accuracy (0.94)
  - Statistical Learning (1.00)
  - Data Science (1.00)