standard deviation
Differentially Private Truncation of Unbounded Data via Public Second Moments
Cao, Zilong, Bi, Xuan, Zhang, Hai
Data privacy is important in the AI era, and differential privacy (DP) is one of the golden solutions. However, DP is typically applicable only if data have a bounded underlying distribution. We address this limitation by leveraging second-moment information from a small amount of public data. We propose Public-moment-guided Truncation (PMT), which transforms private data using the public second-moment matrix and applies a principled truncation whose radius depends only on non-private quantities: data dimension and sample size. This transformation yields a well-conditioned second-moment matrix, enabling its inversion with a significantly strengthened ability to resist the DP noise. Furthermore, we demonstrate the applicability of PMT by using penalized and generalized linear regressions. Specifically, we design new loss functions and algorithms, ensuring that solutions in the transformed space can be mapped back to the original domain. We have established improvements in the models' DP estimation through theoretical error bounds, robustness guarantees, and convergence results, attributing the gains to the conditioning effect of PMT. Experiments on synthetic and real datasets confirm that PMT substantially improves the accuracy and stability of DP models.
- North America > United States > Minnesota (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > Middle East > Jordan (0.04)
- North America > United States (0.14)
- North America > Canada (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Health & Medicine > Diagnostic Medicine > Imaging (0.69)
- Health & Medicine > Therapeutic Area (0.47)
Appendix A Proof of Theorem 2.1
We have the following lemma. Using the notation of Lemma A.1, we have E The third inequality uses the Lipschitz assumption of the loss function. Figure 10 supplements'Relation to disagreement ' at the end of Section 2. It shows an example where the behavior of inconsistency is different from disagreement. All the experiments were done using GPUs (A100 or older). The goal of the experiments reported in Section 3.1 was to find whether/how the predictiveness of The arrows indicate the direction of training becoming longer.
- North America > Canada > British Columbia (0.04)
- Europe > United Kingdom > England > Greater London > London (0.04)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
- Asia > South Korea > Seoul > Seoul (0.04)
- Europe > United Kingdom > England > Greater London > London (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (2 more...)
- Research Report > Experimental Study (1.00)
- Overview (1.00)
- Research Report > New Finding (0.67)
- Banking & Finance (0.92)
- Health & Medicine > Therapeutic Area (0.67)
- Information Technology (0.67)
- North America > Canada > Ontario > Toronto (0.14)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- Europe > Germany (0.04)
- Europe > Germany > Baden-Württemberg > Stuttgart Region > Stuttgart (0.04)
- Europe > Italy (0.04)
- Energy (1.00)
- Health & Medicine > Pharmaceuticals & Biotechnology (0.92)
- North America > United States (0.14)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Switzerland > Fribourg > Fribourg (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.92)
- Health & Medicine (0.93)
- Government (0.67)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)