Partition-based differentially private synthetic data generation
Zhang, Meifan; Deng, Dihang; Yin, Lihua
arXiv.org Artificial Intelligence
Private synthetic data sharing is often preferred over releasing summary statistics because synthetic data preserves the distribution and nuances of the original data. State-of-the-art methods adopt a select-measure-generate paradigm, but measuring marginals over large domains still introduces substantial error, and allocating the privacy budget across iterations remains difficult. To address these issues, our method employs a partition-based approach that reduces these errors and improves the quality of synthetic data, even with a limited privacy budget. In our experiments, the method outperforms existing approaches: the synthetic data it produces shows improved quality and utility, making it a preferable choice for private synthetic data sharing.
Oct-10-2023
- Country:
  - Asia
    - China
      - Beijing > Beijing (0.04)
      - Guangdong Province
      - Jiangsu Province (0.04)
      - Macao (0.04)
  - Europe > Ireland
    - Leinster > County Dublin > Dublin (0.04)
  - North America
    - Canada > Ontario
      - Toronto (0.04)
    - United States
      - Nevada (0.04)
      - New York > New York County
        - New York City (0.04)
  - South America > Brazil (0.04)
- Genre:
  - Research Report (1.00)
- Industry:
  - Information Technology > Security & Privacy (1.00)
- Technology:
  - Information Technology
    - Artificial Intelligence > Machine Learning (1.00)
    - Data Science > Data Mining (0.67)
    - Security & Privacy (1.00)