AITopics | differential privacy

Collaborating Authors

differential privacy

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Optimal Gap-Dependent Regret for Private Stochastic Decision-Theoretic Online Learning

Cesari, Tommaso, Colomboni, Roberto

arXiv.org Machine LearningMay-29-2026

We study stochastic decision-theoretic online learning with full information and event-level pure differential privacy. A COLT open problem of Hu and Mehta asks to determine the optimal gap-dependent regret rate for stochastic decision-theoretic online learning under pure event-level differential privacy. For $K$ actions, losses in $[0,1]$, and a unique best action separated from the second-best action by gap $Δ_{\min}$, the known lower bound is of order $ \frac{\log K}{\min\{Δ_{\min},\varepsilon\}}, $ or equivalently, up to universal constants, of order \[ \frac{\log K}{Δ_{\min}}+\frac{\log K}{\varepsilon}. \] We give a horizon-free pure-DP algorithm and prove the explicit regret bound \[ \operatorname{Reg}_T \le 1000 \cdot \left(\frac{\log K}{Δ_{\min}}+\frac{\log K}{\varepsilon}\right) \] for every horizon $T$. The numerical constant is not optimized. The algorithm partitions time into blocks of exponentially increasing size, plays a single action throughout each block, and chooses the next action by an exponential mechanism applied to a data-independent random prefix of the previous block. The random prefix converts block regret into a sum, over all prefix lengths, of softmax selection errors. A single entropy-potential argument controls all privacy-dominated large-gap actions at cost $\log K/\varepsilon$.

artificial intelligence, machine learning, privacy, (15 more...)

arXiv.org Machine Learning

2605.29148

Country: Europe > United Kingdom (0.28)

Genre: Research Report (0.64)

Industry: Education > Educational Setting > Online (0.92)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.92)

Add feedback

From Privacy to Generalization: Linear Max-Information Bounds for DP-SGD

Lampert, Christoph H., Zakerinia, Hossein

arXiv.org Machine LearningMay-27-2026

Understanding the relationship between generalization and privacy remains a central challenge in modern machine learning theory, particularly for deep networks trained by variants of differentially private stochastic gradient descent (DP-SGD). In this work we make progress on this persistent open problem by proving a finite-sample bound on the approximate max-information of DP-SGD that exhibits scaling properties comparable with (Dwork et al, 2015)'s classic result for $ε$-differentially private algorithms, namely at most linear in the dataset size. From our result we obtain a general-purpose PAC-Bayes generalization bound in which the necessary prior distribution can be learned by DP-SGD, as well as a generalization bound for DP-SGD-trained models themselves, with a complexity term that is fully explicit and controlled by the optimization hyperparameters.

artificial intelligence, dp-sgd, machine learning, (15 more...)

arXiv.org Machine Learning

2605.26222

Country:

Europe (0.28)
North America (0.28)

Genre: Research Report (0.70)

Industry: Information Technology > Security & Privacy (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.55)

Add feedback

Modulated learning for private and distributed regression with just a single sample per client device

Vepakomma, Praneeth, Reisizadeh, Amirhossein, Horváth, Samuel, Dahleh, Munther A.

arXiv.org Machine LearningMay-26-2026

This work focuses on the question of learning from a large number of devices with each device holding only a single sample of data. Several real-world applications exist to this one sample per client setup up including learning from fitness trackers, data/app usage aggregators, body-worn sensing devices, and daily event monitors to name a few. When a client has only one sample, the standard federated learning paradigm breaks down as a local update based on that single point is far from being useful, especially in the earlier rounds for estimation of the model coefficients. This utility is further weakened by the privacy-inducing noise applied at every round. This work caters to this problem to enable such clients to collaboratively contribute to effectively learn a global model without leaking the privacy of their data. The proposed approach injects a single, carefully calibrated noisy perturbation to transform the sample at each client, followed by a post-processed representation which is shared with the server. These representations aggregated at the server are processed to obtain an unbiased gradient update that in expectation matches the non-private centralized gradient while preserving data privacy. This approach is different than traditional private federated learning, where the communication payloads involve model coefficients as opposed to privately transformed data samples. This method enables devices with extremely limited data to collaborate and learn accurate, privacy-preserving models without requiring large local datasets or sacrificing individual privacy.

artificial intelligence, machine learning, privacy, (18 more...)

arXiv.org Machine Learning

2605.07233

Genre: Research Report (0.40)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.48)

Add feedback

Private Adaptive Covariance Estimation via Gaussian Graphical Models

Ferrando, Cecilia, Fuentes, Miguel, Mullins, Brett, Musco, Cameron, Sheldon, Daniel

arXiv.org Machine LearningMay-26-2026

We propose PACE-GGM, a data-adaptive differentially private method for covariance estimation that concentrates its privacy budget on the most informative entries of the empirical covariance matrix, rather than perturbing all entries. This applies in the natural setting where the modeler supplies separate bounds for each variable, so that individual entries can be measured with less noise than the full matrix. In each round, our method selects a poorly approximated entry, measures it using the Gaussian mechanism, and then reconstructs a full covariance matrix using a maximum-entropy reconstruction objective, leading to a Gaussian graphical model structure. Experiments on diverse real-world datasets demonstrate consistent improvements in estimation error with respect to the Gaussian mechanism and other baselines, particularly in high-dimensional and low-to-moderate privacy regimes.

artificial intelligence, machine learning, matrix, (15 more...)

arXiv.org Machine Learning

2605.24295

Country: North America > United States (0.46)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

On the Stability of Spherical Hellinger-Kantorovich Flows and Their Implications for Differential Privacy

Mustafi, Aratrika, Mukherjee, Soumya

arXiv.org Machine LearningMay-25-2026

We consider the problem of sampling from an unnormalized Boltzmann/ Gibbs density, π(θ) exp V(θ),θ Θ Rd, where the normalization constant is unknown (and/or intractable) and only the potential function V (and typically its derivatives) can be evaluated. This problem arises across various domains in Bayesian inference, statistical physics, and modern machine learning. A common variational perspective on sampling is to characterize the target distribution π as the unique minimizer of a functional (typically a divergence functional) over the space of probability measures. From this viewpoint, sampling can be formulated as evolving an initial distribution ρ0 toward π via the gradient flow of this functional under a suitable geometric structure on the space of probability measures. In this paper, we focus on a gradient flow based sampling methodology built from the spherical Hellinger Kantorovich (SHK), also known as the Wasserstein Fisher Rao (WFR), geometry on the space of probability measures (Kondratyev and Vorotnikov, 2019; Liero et al., 2018; Chizat et al., 2015). When the variational objective is the exclusive KL divergence ρ 7 KL(ρ π), the SHK gradient flow generates a time-indexed family of marginals {ρt}t 0 (initialized at ρ0 P2(Θ)) that evolves according to the continuity reaction equation (4). This evolution is equivalent to the birth-death Langevin dynamics introduced in Lu et al. (2019) .

artificial intelligence, exp, machine learning, (20 more...)

arXiv.org Machine Learning

2605.23879

Genre: Research Report (0.63)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)

Add feedback

Node-private community estimation in stochastic block models: Tractable algorithms and lower bounds

Marchis, Laurentiu, D'souza, Ethan, Flídr, Tomáš, Loh, Po-Ling

arXiv.org Machine LearningMay-18-2026

We study the classical problem of community recovery in stochastic block models with a fixed number of communities, with a twist: We seek algorithms that are stable with respect to node-wise changes in the graph structure, formally defined as a differential privacy constraint. The algorithms we develop are based on spectral clustering, where we introduce privacy to the community recovery pipeline in the form of directly privatizing the adjacency matrix; private PCA; private convex optimization; private low-rank matrix estimation; and private approximate subspace estimation. Straightforward applications of existing private algorithms lead to a rapid increase in the privacy parameter $ε$ in order to ensure consistent estimation under node differential privacy, in contrast with the simpler setting of edge privacy. To alleviate these issues, we develop novel algorithms based on (1) sampling from an exponential mechanism with a Lipschitz extension and (2) a general framework for constructing smooth projections from the space of undirected graphs to the space of bounded-degree graphs, which can then be combined with various edge-private algorithms. Importantly, the methods we develop are all computable in polynomial-time as a function of the number of nodes in the graph. We also develop novel lower bounds on the growth rate of $ε$ required in order to achieve consistent community estimation under node privacy. On a technical note, our paper highlights the complications that arise when analyzing private algorithms under the non-standard scaling $ε\rightarrow \infty$ and proposes some solutions. We also provide a novel application of the HGR maximal correlation from information theory in the context of accuracy amplification in PAC learning, which may be of independent interest.

artificial intelligence, machine learning, survey article, (19 more...)

arXiv.org Machine Learning

2605.15943

Genre:

Overview (0.65)
Research Report (0.63)

Industry: Information Technology > Security & Privacy (0.92)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Data Science (0.92)
Information Technology > Security & Privacy (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)

Add feedback

The Privacy Price of Tail-Risk Learning: Effective Tail Sample Size in Differentially Private CVaR Optimization

Mansouri, El Mustapha

arXiv.org Machine LearningMay-18-2026

Differential privacy changes the effective sample size governing CVaR learning. For tail mass $τ$, the privacy-relevant sample size is not $n$, but $nτ$; equivalently, the effective private tail sample size is $εnτ$. Private CVaR excess risk decomposes into ordinary tail-risk statistical error and a privacy price. This decomposition is complete for scalar estimation and finite classes: scalar estimation has rate $Θ(B \min\{1,(nτ)^{-1/2}+(εnτ)^{-1}\})$, and finite classes of size $M$ have rate $Θ(B \min\{1,\sqrt{\log(2M)/(nτ)}+\log(2M)/(εnτ)\})$. These complete rates hold under pure DP, and their lower bounds extend to approximate DP in the stated small-$δ$ regimes. For convex Lipschitz learning, modular upper and lower reductions show that the CVaR-specific privacy term necessarily scales as $1/(εnτ)$, with dimension dependence inherited from private stochastic convex optimization. Together, these results identify ordinary private learning on $Θ(nτ)$ informative tail records as the canonical hard subproblem inside private CVaR learning.

artificial intelligence, cvar, machine learning, (15 more...)

arXiv.org Machine Learning

2605.16219

Country: Asia > Japan > Honshū > Kantō (0.28)

Genre: Research Report (0.50)

Industry: Information Technology > Security & Privacy (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.68)

Add feedback

Differentially Private Sampling from Distributions via Wasserstein Projection

Takakura, Shokichi, Liew, Seng Pei, Hasegawa, Satoshi

arXiv.org Machine LearningMay-12-2026

In this paper, we study the problem of sampling from a distribution under the constraint of differential privacy (DP). Prior works measure the utility of DP sampling with density ratio-based measures such as KL divergence. However, such formulations suffer from two key limitations: 1) they fail to capture the geometric structure of the support, and 2) they are not applicable when the supports of the distributions differ. To deal with these issues, we develop a novel framework for DP sampling with Wasserstein distance as the utility measure. In this formulation, we propose Wasserstein Projection Mechanism (WPM), a minimax optimal mechanism based on Wasserstein projection. Furthermore, we develop efficient algorithms for computing the proposed mechanisms approximately and provide convergence guarantees.

machine learning, mechanism, natural language, (17 more...)

arXiv.org Machine Learning

2605.10015

Genre: Research Report (0.50)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.92)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.46)

Add feedback

Integrating Feature Correlation in Differential Privacy with Applications in DP-ERM

Wang, Tianyu, Zhang, Luhao, Cummings, Rachel

arXiv.org Machine LearningMay-6-2026

Standard differential privacy imposes uniform privacy constraints across all features, overlooking the inherent distinction between sensitive and insensitive features in practice. In this paper, we introduce a relaxed definition of differential privacy that accounts for such heterogeneity, allowing certain features to be treated as insensitive even when correlated with sensitive ones. We propose a correlation-aware framework, $\textsf{CorrDP}$, which relaxes privacy for insensitive features while accounting for their correlations with sensitive features, with the correlations quantified using total variation distance. We design algorithms for differentially private empirical risk minimization (DP-ERM) under the $\textsf{CorrDP}$ framework, incorporating distance-dependent noise into gradients for improved theoretical utility guarantees. When the correlation distance is unknown, we estimate it from the dataset and show that it achieves a comparable privacy-utility guarantee. We perform experiments on synthetic and real-world datasets and show that $\textsf{CorrDP}$-based DP-ERM algorithms consistently outperform the standard DP framework in the presence of insensitive features.

artificial intelligence, data mining, machine learning, (18 more...)

arXiv.org Machine Learning

2605.03945

Genre: Research Report > New Finding (0.68)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

Add feedback

Filters

Collaborating Authors

differential privacy

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

Optimal Gap-Dependent Regret for Private Stochastic Decision-Theoretic Online Learning

From Privacy to Generalization: Linear Max-Information Bounds for DP-SGD

Modulated learning for private and distributed regression with just a single sample per client device

Private Adaptive Covariance Estimation via Gaussian Graphical Models

On the Stability of Spherical Hellinger-Kantorovich Flows and Their Implications for Differential Privacy

Node-private community estimation in stochastic block models: Tractable algorithms and lower bounds

The Privacy Price of Tail-Risk Learning: Effective Tail Sample Size in Differentially Private CVaR Optimization

Differentially Private Sampling from Distributions via Wasserstein Projection

Integrating Feature Correlation in Differential Privacy with Applications in DP-ERM

0172d289da48c48de8c5ebf3de9f7ee1-Paper.pdf