Noisy SGD
High-Dimensional Privacy-Utility Dynamics of Noisy Stochastic Gradient Descent on Least Squares
Lin, Shurong, Kolaczyk, Eric D., Smith, Adam, Paquette, Elliot
The interplay between optimization and privacy has become a central theme in privacy-preserving machine learning. Noisy stochastic gradient descent (SGD) has emerged as a cornerstone algorithm, particularly in large-scale settings. These variants of gradient methods inject carefully calibrated noise into each update to achieve differential privacy, the gold standard notion of rigorous privacy guarantees. Prior work primarily provides various bounds on statistical risk and privacy loss for noisy SGD, yet the \textit{exact} behavior of the process remains unclear, particularly in high-dimensional settings. This work leverages a diffusion approach to analyze noisy SGD precisely, providing a continuous-time perspective that captures both statistical risk evolution and privacy loss dynamics in high dimensions. Moreover, we study a variant of noisy SGD that does not require explicit knowledge of gradient sensitivity, unlike existing work that assumes or enforces sensitivity through gradient clipping. Specifically, we focus on the least squares problem with $\ell_2$ regularization.
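As a rough illustration of the algorithm this abstract studies (not the paper's exact construction or analysis), a minimal noisy SGD loop on $\ell_2$-regularized least squares might look as follows; the function name and the parameters `eta` (step size) and `sigma` (noise scale) are illustrative assumptions:

```python
import numpy as np

def noisy_sgd_least_squares(X, y, lam=0.1, eta=0.05, sigma=0.0, n_steps=1000, seed=0):
    """Noisy SGD on (1/2)(x_i . theta - y_i)^2 + (lam/2)||theta||^2:
    at each step, take one sampled-gradient step with isotropic
    Gaussian noise added to the update (the privacy mechanism)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    theta = np.zeros(d)
    for _ in range(n_steps):
        i = rng.integers(n)                        # sample one data point
        grad = (X[i] @ theta - y[i]) * X[i] + lam * theta
        noise = sigma * rng.standard_normal(d)     # injected privacy noise
        theta -= eta * (grad + noise)
    return theta
```

With `sigma = 0` this reduces to plain SGD on the ridge objective; the paper's diffusion analysis concerns the high-dimensional limit of trajectories like this one.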
Privacy-Preserving Federated Convex Optimization: Balancing Partial-Participation and Efficiency via Noise Cancellation
Reshef, Roie, Levy, Kfir Yehuda
This paper tackles the challenge of achieving Differential Privacy (DP) in Federated Learning (FL) under partial-participation, where only a subset of the machines participate in each time-step. While previous work achieved optimal performance in full-participation settings, these methods struggled to extend to partial-participation scenarios. Our approach fills this gap by introducing a novel noise-cancellation mechanism that preserves privacy without sacrificing convergence rates or computational efficiency. We analyze our method within the Stochastic Convex Optimization (SCO) framework and show that it delivers optimal performance for both homogeneous and heterogeneous data distributions. This work expands the applicability of DP in FL, offering an efficient and practical solution for privacy-preserving learning in distributed systems with partial participation.
Mixing Times and Privacy Analysis for the Projected Langevin Algorithm under a Modulus of Continuity
Bravo, Mario, Flores-Mella, Juan P., Guzmán, Cristóbal
We study the mixing time of the projected Langevin algorithm (LA) and the privacy curve of noisy Stochastic Gradient Descent (SGD), beyond nonexpansive iterations. Specifically, we derive new mixing time bounds for the projected LA which are, in some important cases, dimension-free and poly-logarithmic on the accuracy, closely matching the existing results in the smooth convex case. Additionally, we establish new upper bounds for the privacy curve of the subsampled noisy SGD algorithm. These bounds show a crucial dependency on the regularity of gradients, and are useful for a wide range of convex losses beyond the smooth case. Our analysis relies on a suitable extension of the Privacy Amplification by Iteration (PABI) framework (Feldman et al., 2018; Altschuler and Talwar, 2022, 2023) to noisy iterations whose gradient map is not necessarily nonexpansive. This extension is achieved by designing an optimization problem which accounts for the best possible R\'enyi divergence bound obtained by an application of PABI, where the tractability of the problem is crucially related to the modulus of continuity of the associated gradient mapping. We show that, in several interesting cases -- including the nonsmooth convex, weakly smooth and (strongly) dissipative -- such optimization problem can be solved exactly and explicitly. This yields the tightest possible PABI-based bounds, where our results are either new or substantially sharper than those in previous works.
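The projected Langevin algorithm analyzed here has a simple iteration: a gradient step, Gaussian noise scaled by $\sqrt{2\eta}$, then projection back onto the constraint set. The sketch below is a generic illustration of that iteration, not the paper's PABI analysis; the function names and parameters are assumptions:

```python
import numpy as np

def projected_langevin(grad_f, project, x0, eta=0.01, n_steps=1000, seed=0):
    """Projected Langevin algorithm: gradient step + sqrt(2*eta)-scaled
    Gaussian noise, followed by projection onto the feasible set."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    for _ in range(n_steps):
        noise = np.sqrt(2 * eta) * rng.standard_normal(x.shape)
        x = project(x - eta * grad_f(x) + noise)
    return x

# Example: approximately sample from exp(-||x||^2 / 2) restricted to the unit ball.
grad_f = lambda x: x
project = lambda x: x / max(1.0, np.linalg.norm(x))
sample = projected_langevin(grad_f, project, np.zeros(2))
```

The mixing-time question is how many such iterations are needed before the law of `x` is close to the target distribution; the projection is what takes the analysis beyond nonexpansive iterations.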
Data Deletion for Linear Regression with Noisy SGD
Xia, Zhangjie, Wang, Chi-Hua, Cheng, Guang
In the current era of big data and machine learning, it is essential to find ways to shrink the training dataset while preserving training performance, in order to improve efficiency. The challenge is to find practical ways of identifying points that can be deleted without significantly harming the training result or causing problems such as underfitting. We therefore introduce the perfect deleted point problem for 1-step noisy SGD in the classical linear regression task: find the point in the training dataset whose deletion yields a model identical to the one trained on the full dataset. We apply the so-called signal-to-noise ratio and show that its value is closely related to the selection of the perfect deleted point. We also implement an algorithm based on this criterion and empirically demonstrate its effectiveness on a synthetic dataset. Finally, we analyze the consequences of the perfect deleted point, specifically how it affects training performance and privacy budget, thereby highlighting its potential. This research underscores the importance of data deletion and the urgent need for further study in this field.
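To make the setup concrete: with 1-step noisy SGD, one can measure how far the model moves when a single point is deleted, holding the noise fixed. The sketch below is an illustrative setup under our own assumptions (full-batch gradient from $\theta = 0$, shared noise seed); it is not the paper's algorithm or its signal-to-noise criterion:

```python
import numpy as np

def one_step_noisy_sgd(X, y, eta=0.1, sigma=0.5, seed=0):
    """One full-batch noisy gradient step for linear regression from theta = 0."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    grad = X.T @ (X @ np.zeros(d) - y) / n    # gradient at theta = 0 is -X^T y / n
    noise = sigma * rng.standard_normal(d)
    return -eta * (grad + noise)

def deletion_shift(X, y, k, **kw):
    """Distance between the one-step model on the full data and the one-step
    model with point k removed. Using the same seed makes the noise cancel,
    so the gap isolates point k's contribution to the gradient."""
    full = one_step_noisy_sgd(X, y, **kw)
    Xd, yd = np.delete(X, k, axis=0), np.delete(y, k)
    reduced = one_step_noisy_sgd(Xd, yd, **kw)
    return np.linalg.norm(full - reduced)
```

A "perfect deleted point" in the abstract's sense would be one for which this shift is (exactly) zero.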
Mean-Field Analysis of Two-Layer Neural Networks: Global Optimality with Linear Convergence Rates
Zhang, Jingwei, Huang, Xunpeng, Yu, Jincheng
Gradient-based optimization is a fundamental tool in machine learning and has seen great empirical success in training neural networks, despite the highly non-convex landscape of the objective. However, theoretical understanding of non-convex optimization in neural networks remains limited. Recently, much work has explained the success of gradient-based optimization in overparametrized neural networks, that is, neural networks with a massive number of hidden units. Under the overparametrization condition, the learning problem can be translated into minimizing a convex functional, thereby circumventing the difficulties of analyzing non-convex objectives. It is worth mentioning the much broader interest in analyzing the convergence of machine learning algorithms by formulating them as the minimization of some (usually convex) functional of a measure, as in variational inference (Liu and Wang (2016); Liu (2017); Chewi et al. (2020)), generative adversarial networks (Johnson and Zhang (2019); Nitanda and Suzuki (2020)), and learning infinite-width neural networks (Chizat and Bach (2018); Mei et al. (2018); Nguyen and Pham (2020); Fang et al. (2021)). The key idea is to approximate the learning dynamics of the model parameters by optimization over the space of probability measures on the model parameters, under the overparametrization condition.
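The mean-field viewpoint rests on a specific scaling: the network output is an average over hidden units, i.e. an integral of a single-unit feature against the empirical measure of the parameters. A minimal sketch of that scaling (with ReLU activation as an illustrative choice, not necessarily the papers' setting):

```python
import numpy as np

def two_layer_mean_field(x, W, a):
    """Two-layer network in mean-field scaling:
    f(x) = (1/m) * sum_j a_j * relu(w_j . x),
    i.e. the output averages over the m hidden units, so it equals an
    integral against the empirical measure of the parameters (w_j, a_j)."""
    m = W.shape[0]
    return (a * np.maximum(W @ x, 0.0)).sum() / m
```

Because of the `1/m` factor, letting the number of hidden units grow turns training dynamics on the parameters into dynamics on a probability measure, which is where the convexity in the measure-space formulation comes from.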
On the Protection of Private Information in Machine Learning Systems: Two Recent Approaches
Abadi, Martín, Erlingsson, Úlfar, Goodfellow, Ian, McMahan, H. Brendan, Mironov, Ilya, Papernot, Nicolas, Talwar, Kunal, Zhang, Li
In their classic tutorial paper, Saltzer and Schroeder described the mechanics of protecting information in computer systems, as it was understood in the mid 1970s [1]. They were interested, in particular, in mechanisms for achieving privacy, which they defined as follows: The term "privacy" denotes a socially defined ability of an individual (or organization) to determine whether, when, and to whom personal (or organizational) information is to be released. Saltzer and Schroeder took "security" to refer to the body of techniques for controlling the use or modification of computers or information. In this sense, security is an essential element of guaranteeing privacy. Their definitions are roughly in line with our current ideas, perhaps because they helped shape those ideas.