AITopics | dp-sgd

dp-sgd

Escaping Iterative Parameter-Space Noise: Differentially Private Learning with a Hypernetwork

Nishikawa, Naoki, Takakura, Shokichi, Hasegawa, Satoshi

arXiv.org Machine Learning, Jun-26-2026

Differentially private (DP) training of neural networks is often hindered by the large amount of noise required by gradient-based methods such as DP-SGD, which repeatedly inject high-dimensional noise in parameter space throughout training. In this paper, we propose a new framework for DP learning that avoids iterative optimization in parameter space. Instead of updating the target model using privatized gradients, we employ a hypernetwork trained on public datasets to map a private dataset to the parameters of the target model. Specifically, each example is embedded into a low-dimensional representation, the embeddings are aggregated and perturbed to obtain a DP dataset embedding, and the hypernetwork generates the target model parameters from this noisy embedding. Because privacy noise is injected only once into a low-dimensional dataset representation, our approach can significantly reduce the adverse effect of noise. We theoretically show in a synthetic setting that, under a fixed privacy budget, models produced by our approach achieve higher utility than those trained with DP-SGD. Moreover, we apply our approach to LoRA fine-tuning of diffusion models and show that it achieves lower FID than LoRA models trained with DP-SGD and other public-data-guided methods.
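As a rough illustration of the aggregate-and-perturb step described in this abstract, the sketch below clips each per-example embedding, sums the clipped vectors, and adds Gaussian noise a single time. It is a minimal sketch, not the authors' implementation: the function names, the clipping rule, the noise scale `noise_multiplier * max_norm`, and the assumption that the dataset size is public are all illustrative choices, and calibrating the noise to a specific privacy budget is left out.

```python
import math
import random


def clip_to_norm(vec, max_norm):
    """Scale vec down so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(x * x for x in vec))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [x * scale for x in vec]


def dp_dataset_embedding(embeddings, max_norm, noise_multiplier, rng):
    """Clip per-example embeddings, sum them, add Gaussian noise once, average.

    Assumption: the number of examples is treated as public, so dividing
    by it is post-processing of the noisy sum.
    """
    dim = len(embeddings[0])
    total = [0.0] * dim
    for emb in embeddings:
        clipped = clip_to_norm(emb, max_norm)
        for i in range(dim):
            total[i] += clipped[i]
    # One noise draw per coordinate of the low-dimensional sum.
    sigma = noise_multiplier * max_norm
    noisy_sum = [t + rng.gauss(0.0, sigma) for t in total]
    return [x / len(embeddings) for x in noisy_sum]
```

In the abstract's framing, a vector like the one returned here would be the only quantity computed from private data, and a hypernetwork trained on public data would map it to the target model's parameters.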

artificial intelligence, machine learning, natural language, (18 more...)

arXiv:2606.26772

Genre: Research Report (0.64)

Industry: Information Technology > Security & Privacy (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Deep Learning with Plausible Deniability

Neural Information Processing Systems, Jun-23-2026, 12:32:13 GMT

Deep learning models are vulnerable to privacy attacks due to their tendency to memorize individual training examples. Theoretically sound defenses such as differential privacy can defend against this threat, but model performance often suffers. Empirical defenses may thwart existing attacks while maintaining model performance but do not offer any robust theoretical guarantees. In this paper, we explore a new strategy based on the concept of plausible deniability. We introduce a training algorithm called Plausibly Deniable Stochastic Gradient Descent (PD-SGD). The core of this approach is a rejection sampling technique, which probabilistically prevents updating model parameters whenever a mini-batch cannot be plausibly denied. We provide theoretical results showing that PD-SGD effectively mitigates privacy leakage from individual data points. Experiments demonstrate the scalability of PD-SGD and the favorable privacy-utility trade-off it offers compared to existing defense methods.

artificial intelligence, machine learning, pd-sgd, (17 more...)

Country: North America > United States (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Private Training Large-scale Models with Efficient DP-SGD

Neural Information Processing Systems, Jun-15-2026, 18:58:14 GMT

As large language models (LLMs) increasingly underpin technological advancements, the privacy of their training data emerges as a critical concern. Differential Privacy (DP) serves as a rigorous mechanism to protect this data, yet its integration via Differentially Private Stochastic Gradient Descent (DP-SGD) introduces substantial challenges, primarily due to the complexities of per-sample gradient clipping.
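The per-sample gradient clipping this abstract refers to can be sketched as follows. This is the textbook DP-SGD update written in plain Python, not the efficient method the paper proposes; the function name and its arguments are hypothetical, and privacy accounting is omitted. The sketch makes the source of the cost visible: every example's gradient has to be available separately before it can be clipped.

```python
import math
import random


def dp_sgd_step(params, per_sample_grads, lr, clip_norm, noise_multiplier, rng):
    """One DP-SGD update: clip each example's gradient, sum, add noise, average."""
    dim = len(params)
    clipped_sum = [0.0] * dim
    for grad in per_sample_grads:
        norm = math.sqrt(sum(g * g for g in grad))
        # Rescale only gradients whose L2 norm exceeds clip_norm.
        scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
        for i in range(dim):
            clipped_sum[i] += grad[i] * scale
    # Gaussian noise with standard deviation noise_multiplier * clip_norm
    # is added to every coordinate of the summed gradient.
    sigma = noise_multiplier * clip_norm
    batch_size = len(per_sample_grads)
    noisy_mean = [(s + rng.gauss(0.0, sigma)) / batch_size for s in clipped_sum]
    return [p - lr * g for p, g in zip(params, noisy_mean)]
```

For a model with many parameters, `per_sample_grads` holds one full-size gradient per example in the batch, which is the memory overhead that efficient implementations try to avoid.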

large language model, machine learning, natural language, (19 more...)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

FedSVD: Adaptive Orthogonalization for Private Federated Learning with LoRA

Neural Information Processing Systems, Jun-13-2026, 20:22:57 GMT

Low-Rank Adaptation (LoRA), which introduces a product of two trainable low-rank matrices into frozen pre-trained weights, is widely used for efficient fine-tuning of language models in federated learning (FL). However, when combined with differentially private stochastic gradient descent (DP-SGD), LoRA faces substantial noise amplification: DP-SGD perturbs per-sample gradients, and the matrix multiplication of the LoRA update ($BA$) intensifies this effect. Freezing one matrix (e.g., $A$) reduces the noise but restricts model expressiveness, often resulting in suboptimal adaptation. To address this, we propose FedSVD, a simple yet effective method that introduces a global reparameterization based on singular value decomposition (SVD).
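To see where such amplification can come from, consider a simplified picture in which DP noise ends up as additive perturbations $E_B$ and $E_A$ on the two LoRA factors. The symbols $E_B$ and $E_A$ are introduced here for illustration only and do not appear in the abstract. Expanding the product gives:

```latex
\[
(B + E_B)(A + E_A) = BA + B E_A + E_B A + E_B E_A
\]
```

The last term is a product of two noise matrices and is therefore quadratic in the noise. If $A$ is frozen, then $E_A = 0$ and only the linear term $E_B A$ remains, which matches the abstract's remark that freezing one matrix reduces the noise.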

artificial intelligence, machine learning, proceedings, (8 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.59)

Differentially Private Relational Learning with Entity-level Privacy Guarantees

Neural Information Processing Systems, Jun-12-2026, 22:45:45 GMT

Learning with relational and network-structured data is increasingly vital in sensitive domains where protecting the privacy of individual entities is paramount. Differential Privacy (DP) offers a principled approach for quantifying privacy risks, with DP-SGD emerging as a standard mechanism for private model training. However, directly applying DP-SGD to relational learning is challenging due to two key factors: (i) entities often participate in multiple relations, resulting in high and difficult-to-control sensitivity; and (ii) relational learning typically involves multi-stage, potentially coupled (interdependent) sampling procedures that make standard privacy amplification analyses inapplicable. This work presents a principled framework for relational learning with formal entity-level DP guarantees. We provide a rigorous sensitivity analysis and introduce an adaptive gradient clipping scheme that modulates clipping thresholds based on entity occurrence frequency. We also extend the privacy amplification results to a tractable subclass of coupled sampling, where the dependence arises only through sample sizes. These contributions lead to a tailored DP-SGD variant for relational data with provable privacy guarantees. Experiments on fine-tuning text encoders over text-attributed network-structured relational data demonstrate the strong utility-privacy trade-offs of our approach.

artificial intelligence, machine learning, proceedings, (5 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.89)
Information Technology > Databases (0.83)

Private Zeroth-Order Optimization with Public Data

Neural Information Processing Systems, Jun-12-2026, 05:49:54 GMT

One of the major bottlenecks for deploying popular first-order differentially private (DP) machine learning algorithms (e.g., DP-SGD) lies in their high computation and memory cost, despite the existence of optimized implementations. Zeroth-order methods show promise in mitigating this overhead: they use function evaluations to approximate the gradients and are hence significantly easier to privatize. While recent works have explored zeroth-order approaches in both private and non-private settings, they still suffer from relatively low utility compared with DP-SGD, and have only been evaluated in limited application domains. In this work, we propose to leverage public information to guide and improve gradient approximation of private zeroth-order algorithms.
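The reason function evaluations are easier to privatize can be seen in a small sketch: a two-point finite difference along a random direction reduces each example's contribution to a single scalar, which can be clipped and noised directly. The code below is a generic private zeroth-order step under assumed names (`private_zo_step`, `squared_error`, `mu`, `clip`), not the public-data-guided method the abstract proposes, and it omits privacy accounting.

```python
import random


def private_zo_step(params, examples, loss_fn, lr, mu, clip, noise_multiplier, rng):
    """One zeroth-order update with per-example scalar clipping and Gaussian noise."""
    # The direction is sampled without looking at the data.
    direction = [rng.gauss(0.0, 1.0) for _ in params]
    plus = [p + mu * d for p, d in zip(params, direction)]
    minus = [p - mu * d for p, d in zip(params, direction)]
    total = 0.0
    for example in examples:
        # Two function evaluations estimate the directional derivative.
        slope = (loss_fn(plus, example) - loss_fn(minus, example)) / (2.0 * mu)
        # Each example contributes one scalar clipped to [-clip, clip].
        total += max(-clip, min(clip, slope))
    # Noise is added to a single scalar, not to a full gradient vector.
    noisy_slope = (total + rng.gauss(0.0, noise_multiplier * clip)) / len(examples)
    return [p - lr * noisy_slope * d for p, d in zip(params, direction)]


def squared_error(params, example):
    """Toy per-example loss used only for illustration."""
    return sum((p - x) ** 2 for p, x in zip(params, example))
```

No per-example gradient vector is ever stored, which is where the memory saving over first-order methods comes from. The price is a noisier search direction, consistent with the utility gap the abstract mentions.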

artificial intelligence, machine learning, proceedings, (6 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Private Training Large-scale Models with Efficient DP-SGD

Neural Information Processing Systems, Jun-11-2026, 07:41:45 GMT

large language model, machine learning, natural language, (9 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.62)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.59)

GeoClip: Geometry-Aware Clipping for Differentially Private SGD

Neural Information Processing Systems, Jun-10-2026, 00:50:39 GMT

Differentially private stochastic gradient descent (DP-SGD) is the most widely used method for training machine learning models with provable privacy guarantees. A key challenge in DP-SGD is setting the per-sample gradient clipping threshold, which significantly affects the trade-off between privacy and utility. While recent adaptive methods improve performance by adjusting this threshold during training, they operate in the standard coordinate system and fail to account for correlations across the coordinates of the gradient. We propose GeoClip, a geometry-aware framework that clips and perturbs gradients in a transformed basis aligned with the geometry of the gradient distribution. GeoClip adaptively estimates this transformation using only previously released noisy gradients, incurring no additional privacy cost. We provide convergence guarantees for GeoClip and derive a closed-form solution for the optimal transformation that minimizes the amount of noise added while keeping the probability of gradient clipping under control. Experiments on both tabular and image datasets demonstrate that GeoClip consistently outperforms existing adaptive clipping methods under the same privacy budget.

artificial intelligence, machine learning, proceedings, (7 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.60)

When Do Fewer Coordinates Suffice in DP-SGD?

Zhang, Huiqi, Xie, Fang

arXiv.org Machine Learning, Jun-4-2026

Differentially private stochastic gradient descent (DP-SGD) injects noise into every updated coordinate, making the injected noise energy scale with the ambient parameter dimension $d$. We ask when private training can update fewer coordinates without losing the signal needed for optimization. We propose TP-TopK (Two-Phase TopK DP-SGD), a two-phase method for coordinate-sparse private training without public data, in which a private warm-up phase identifies a coordinate support used to guide the main training phase. We give a criterion characterizing when coordinate restriction can be beneficial, show via a nonconvex stationarity bound that under this condition the relevant noise term scales with the active dimension $k$ rather than the full parameter dimension $d$, and provide a lower bound on the reliability of warm-up-based coordinate ranking. Experiments on MNIST, FMNIST, and CIFAR-10 show that learned coordinate supports can retain more gradient energy than size-matched random supports, with the largest gains when the active dimension is small and warm-up scores are informative.
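The scaling claim in this abstract rests on a simple fact: a vector of independent Gaussian entries with standard deviation sigma has expected squared norm equal to sigma squared times the number of noised coordinates. The sketch below illustrates only that arithmetic. The function names are hypothetical, and the code says nothing about how TP-TopK selects its support or accounts for the privacy cost of the warm-up phase.

```python
import random


def masked_gaussian_noise(dim, support, sigma, rng):
    """Return a length-dim noise vector that is nonzero only on the support indices."""
    noise = [0.0] * dim
    for index in support:
        noise[index] = rng.gauss(0.0, sigma)
    return noise


def expected_noise_energy(num_noised_coords, sigma):
    """Expected squared L2 norm of i.i.d. N(0, sigma^2) noise on the given number of coordinates."""
    return num_noised_coords * sigma ** 2
```

With the same sigma, restricting noise from d = 1000 coordinates to k = 10 lowers the expected noise energy by a factor of d / k = 100. Whether this helps in practice depends on how much gradient signal the chosen coordinates retain, which is the question the abstract studies.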

artificial intelligence, machine learning, phase 2, (17 more...)

arXiv:2606.04375

Country: Asia > China (0.46)

Genre: Research Report (0.40)

Industry: Health & Medicine (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.86)

From Privacy to Generalization: Linear Max-Information Bounds for DP-SGD

Lampert, Christoph H., Zakerinia, Hossein

arXiv.org Machine Learning, May-27-2026

Understanding the relationship between generalization and privacy remains a central challenge in modern machine learning theory, particularly for deep networks trained by variants of differentially private stochastic gradient descent (DP-SGD). In this work we make progress on this persistent open problem by proving a finite-sample bound on the approximate max-information of DP-SGD whose scaling is comparable to that of the classic result of Dwork et al. (2015) for $ε$-differentially private algorithms, namely at most linear in the dataset size. From our result we obtain a general-purpose PAC-Bayes generalization bound in which the necessary prior distribution can be learned by DP-SGD, as well as a generalization bound for DP-SGD-trained models themselves, with a complexity term that is fully explicit and controlled by the optimization hyperparameters.

artificial intelligence, dp-sgd, machine learning, (15 more...)

arXiv:2605.26222

Country:

Europe (0.28)
North America (0.28)

Genre: Research Report (0.70)

Industry: Information Technology > Security & Privacy (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.55)
