Partition-based differentially private synthetic data generation

Oct-10-2023–arXiv.org Artificial Intelligence

Private synthetic data sharing is often preferred over releasing summary statistics because it preserves the distribution and nuances of the original data. State-of-the-art methods adopt a select-measure-generate paradigm, but measuring marginals over large domains still incurs substantial error, and allocating the privacy budget across iterations remains difficult. To address these issues, our method employs a partition-based approach that reduces error and improves the quality of the synthetic data, even with a limited privacy budget. In our experiments, the method outperforms existing approaches: the synthetic data it produces shows improved quality and utility, making it a preferable choice for private synthetic data sharing.
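To make the large-domain problem concrete, the sketch below measures a single marginal with the standard Laplace mechanism and compares the total error over a small and a large domain. It is an illustration of the measure step in general, not the partition-based method of the paper; the function names, the add/remove-one neighbouring definition (L1 sensitivity 1), the uniform toy data, and all parameter values are assumptions made for this example.

```python
import numpy as np

def measure_marginal(records, domain_sizes, epsilon, rng):
    """Return (true counts, noisy counts) for the marginal over all columns."""
    # Map each record (a row of category indices) to one flat cell index.
    cells = np.ravel_multi_index(tuple(records.T), domain_sizes)
    n_cells = int(np.prod(domain_sizes))
    counts = np.bincount(cells, minlength=n_cells).astype(float)
    # Assumption: add/remove-one neighbours, so the histogram has L1
    # sensitivity 1 and Laplace noise of scale 1/epsilon is added per cell.
    noisy = counts + rng.laplace(scale=1.0 / epsilon, size=n_cells)
    return counts, noisy

def total_l1_error(domain_sizes, n_records=1000, epsilon=1.0, seed=0):
    """Total absolute noise over all cells of one noisy marginal."""
    rng = np.random.default_rng(seed)
    # Hypothetical toy data: each attribute drawn uniformly from its domain.
    records = np.column_stack(
        [rng.integers(0, d, size=n_records) for d in domain_sizes]
    )
    counts, noisy = measure_marginal(records, domain_sizes, epsilon, rng)
    return float(np.abs(noisy - counts).sum())

small = total_l1_error((2, 3))    # marginal with 6 cells
large = total_l1_error((20, 30))  # marginal with 600 cells
print(f"total L1 error, 6 cells:   {small:.1f}")
print(f"total L1 error, 600 cells: {large:.1f}")
```

Because every cell receives independent noise of the same scale, the total error grows roughly in proportion to the number of cells while the number of records stays fixed, so the signal in each cell shrinks as the domain grows.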




Genre:
- Research Report (1.00)

Industry:
- Information Technology > Security & Privacy (1.00)

Technology:
- Information Technology
  - Security & Privacy (1.00)
  - Artificial Intelligence > Machine Learning (1.00)
  - Data Science > Data Mining (0.67)
