AITopics | dp-gd

Collaborating Authors

dp-gd

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

acb3e20075b0a2dfa3565f06681578e5-Paper-Conference.pdf

Neural Information Processing SystemsFeb-16-2026, 12:10:05 GMT

data mining, machine learning, trade-off function, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
Europe > Austria > Vienna (0.14)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
(17 more...)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Data Science > Data Mining (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

Add feedback

Optimization, Generalization and Differential Privacy Bounds for Gradient Descent on Kolmogorov-Arnold Networks

Wang, Puyu, Zhou, Junyu, Liznerski, Philipp, Kloft, Marius

arXiv.org Machine LearningFeb-5-2026

Kolmogorov--Arnold Networks (KANs) have recently emerged as a structured alternative to standard MLPs, yet a principled theory for their training dynamics, generalization, and privacy properties remains limited. In this paper, we analyze gradient descent (GD) for training two-layer KANs and derive general bounds that characterize their training dynamics, generalization, and utility under differential privacy (DP). As a concrete instantiation, we specialize our analysis to logistic loss under an NTK-separable assumption, where we show that polylogarithmic network width suffices for GD to achieve an optimization rate of order $1/T$ and a generalization rate of order $1/n$, with $T$ denoting the number of GD iterations and $n$ the sample size. In the private setting, we characterize the noise required for $(ε,δ)$-DP and obtain a utility bound of order $\sqrt{d}/(nε)$ (with $d$ the input dimension), matching the classical lower bound for general convex Lipschitz problems. Our results imply that polylogarithmic width is not only sufficient but also necessary under differential privacy, revealing a qualitative gap between non-private (sufficiency only) and private (necessity also emerges) training regimes. Experiments further illustrate how these theoretical insights can guide practical choices, including network width selection and early stopping.

artificial intelligence, machine learning, theorem 4, (18 more...)

arXiv.org Machine Learning

2601.22409

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Germany > Rhineland-Palatinate > Kaiserslautern (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Ingolstadt (0.04)

Genre: Research Report > New Finding (0.48)

Industry: Information Technology > Security & Privacy (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.85)

Add feedback

Unified Enhancement of Privacy Bounds for Mixture Mechanisms via f -Differential Privacy

Neural Information Processing SystemsDec-26-2025, 13:22:40 GMT

Differentially private (DP) machine learning algorithms incur many sources of randomness, such as random initialization, random batch subsampling, and shuffling. However, such randomness is difficult to take into account when proving differential privacy bounds because it induces mixture distributions for the algorithm's output that are difficult to analyze. This paper focuses on improving privacy bounds for shuffling models and one-iteration differentially private gradient descent (DP-GD) with random initializations using $f$-DP. We derive a closed-form expression of the trade-off function for shuffling models that outperforms the most up-to-date results based on $(\epsilon,\delta)$-DP.Moreover, we investigate the effects of random initialization on the privacy of one-iteration DP-GD. Our numerical computations of the trade-off function indicate that random initialization can enhance the privacy of DP-GD.Our analysis of $f$-DP guarantees for these mixture mechanisms relies on an inequality for trade-off functions introduced in this paper. This inequality implies the joint convexity of $F$-divergences. Finally, we study an $f$-DP analog of the advanced joint convexity of the hockey-stick divergence related to $(\epsilon,\delta)$-DP and apply it to analyze the privacy of mixture mechanisms.

mixture mechanism, random initialization, unified enhancement, (9 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Towards Understanding Generalization in DP-GD: A Case Study in Training Two-Layer CNNs

Shi, Zhongjie, Wang, Puyu, Zhang, Chenyang, Cao, Yuan

arXiv.org Machine LearningDec-1-2025

Modern deep learning techniques focus on extracting intricate information from data to achieve accurate predictions. However, the training datasets may be crowdsourced and include sensitive information, such as personal contact details, financial data, and medical records. As a result, there is a growing emphasis on developing privacy-preserving training algorithms for neural networks that maintain good performance while preserving privacy. In this paper, we investigate the generalization and privacy performances of the differentially private gradient descent (DP-GD) algorithm, which is a private variant of the gradient descent (GD) by incorporating additional noise into the gradients during each iteration. Moreover, we identify a concrete learning task where DP-GD can achieve superior generalization performance compared to GD in training two-layer Huberized ReLU convolutional neural networks (CNNs). Specifically, we demonstrate that, under mild conditions, a small signal-to-noise ratio can result in GD producing training models with poor test accuracy, whereas DP-GD can yield training models with good test accuracy and privacy guarantees if the signal-to-noise ratio is not too small. This indicates that DP-GD has the potential to enhance model performance while ensuring privacy protection in certain learning tasks. Numerical simulations are further conducted to support our theoretical results.

dp-gd, inequality, second inequality, (16 more...)

arXiv.org Machine Learning

2511.2227

Country:

Asia > China > Hong Kong (0.04)
Europe > Germany > Rhineland-Palatinate > Kaiserslautern (0.04)
North America > United States > Georgia > Fulton County > Atlanta (0.04)

Genre: Research Report (0.63)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Health Care Technology > Medical Record (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.86)

Add feedback

Unified Enhancement of Privacy Bounds for Mixture Mechanisms via f-Differential Privacy

Neural Information Processing SystemsOct-9-2025, 04:33:16 GMT

Differentially private (DP) machine learning algorithms incur many sources of randomness, such as random initialization, random batch subsampling, and shuffling.

artificial intelligence, machine learning, trade-off function, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
Europe > Austria > Vienna (0.14)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
(17 more...)

Industry: Information Technology > Security & Privacy (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

Add feedback

On the Performance of Differentially Private Optimization with Heavy-Tail Class Imbalance

Tang, Qiaoyue, Zhiyanov, Alain, Lécuyer, Mathias

arXiv.org Artificial IntelligenceJul-15-2025

In this work, we analyze the optimization behaviour of common private learning optimization algorithms under heavy-tail class imbalanced distribution. We show that, in a stylized model, optimizing with Gradient Descent with differential privacy (DP-GD) suffers when learning low-frequency classes, whereas optimization algorithms that estimate second-order information do not. In particular, DP-AdamBC that removes the DP bias from estimating loss curvature is a crucial component to avoid the ill-condition caused by heavy-tail class imbalance, and empirically fits the data better with $\approx8\%$ and $\approx5\%$ increase in training accuracy when learning the least frequent classes on both controlled experiments and real data respectively.

artificial intelligence, machine learning, optimization problem, (19 more...)

arXiv.org Artificial Intelligence

2507.10536

Country: North America > Canada > British Columbia (0.05)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.35)

Add feedback

Unified Enhancement of Privacy Bounds for Mixture Mechanisms via f -Differential Privacy

Neural Information Processing SystemsJan-19-2025, 19:04:29 GMT

Differentially private (DP) machine learning algorithms incur many sources of randomness, such as random initialization, random batch subsampling, and shuffling. However, such randomness is difficult to take into account when proving differential privacy bounds because it induces mixture distributions for the algorithm's output that are difficult to analyze. This paper focuses on improving privacy bounds for shuffling models and one-iteration differentially private gradient descent (DP-GD) with random initializations using f -DP. We derive a closed-form expression of the trade-off function for shuffling models that outperforms the most up-to-date results based on (\epsilon,\delta) -DP.Moreover, we investigate the effects of random initialization on the privacy of one-iteration DP-GD. Our numerical computations of the trade-off function indicate that random initialization can enhance the privacy of DP-GD.Our analysis of f -DP guarantees for these mixture mechanisms relies on an inequality for trade-off functions introduced in this paper.

privacy, random initialization, unified enhancement, (9 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

DIFF2: Differential Private Optimization via Gradient Differences for Nonconvex Distributed Learning

Murata, Tomoya, Suzuki, Taiji

arXiv.org Artificial IntelligenceJun-3-2023

Differential private optimization for nonconvex smooth objective is considered. In the previous work, the best known utility bound is $\widetilde O(\sqrt{d}/(n\varepsilon_\mathrm{DP}))$ in terms of the squared full gradient norm, which is achieved by Differential Private Gradient Descent (DP-GD) as an instance, where $n$ is the sample size, $d$ is the problem dimensionality and $\varepsilon_\mathrm{DP}$ is the differential privacy parameter. To improve the best known utility bound, we propose a new differential private optimization framework called \emph{DIFF2 (DIFFerential private optimization via gradient DIFFerences)} that constructs a differential private global gradient estimator with possibly quite small variance based on communicated \emph{gradient differences} rather than gradients themselves. It is shown that DIFF2 with a gradient descent subroutine achieves the utility of $\widetilde O(d^{2/3}/(n\varepsilon_\mathrm{DP})^{4/3})$, which can be significantly better than the previous one in terms of the dependence on the sample size $n$. To the best of our knowledge, this is the first fundamental result to improve the standard utility $\widetilde O(\sqrt{d}/(n\varepsilon_\mathrm{DP}))$ for nonconvex objectives. Additionally, a more computational and communication efficient subroutine is combined with DIFF2 and its theoretical analysis is also given. Numerical experiments are conducted to validate the superiority of DIFF2 framework.

artificial intelligence, differential private optimization, machine learning, (13 more...)

arXiv.org Artificial Intelligence

2302.03884

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
North America > United States > California (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.55)

Add feedback

Characterizing Private Clipped Gradient Descent on Convex Generalized Linear Problems

Song, Shuang, Thakkar, Om, Thakurta, Abhradeep

arXiv.org Machine LearningJun-11-2020

Differentially private gradient descent (DP-GD) has been extremely effective both theoretically, and in practice, for solving private empirical risk minimization (ERM) problems. In this paper, we focus on understanding the impact of the clipping norm, a critical component of DP-GD, on its convergence. We provide the first formal convergence analysis of clipped DP-GD. More generally, we show that the value which one sets for clipping really matters: done wrong, it can dramatically affect the resulting quality; done properly, it can eliminate the dependence of convergence on the model dimensionality. We do this by showing a dichotomous behavior of the clipping norm. First, we show that if the clipping norm is set smaller than the optimal, even by a constant factor, the excess empirical risk for convex ERMs can increase from $O(1/n)$ to $\Omega(1)$, where $n$ is the number of data samples. Next, we show that, regardless of the value of the clipping norm, clipped DP-GD minimizes a well-defined convex objective over an unconstrained space, as long as the underlying ERM is a generalized linear problem. Furthermore, if the clipping norm is set within at most a constant factor higher than the optimal, then one can obtain an excess empirical risk guarantee that is independent of the dimensionality of the model space. Finally, we extend our result to non-convex generalized linear problems by showing that DP-GD reaches a first-order stationary point as long as the loss is smooth, and the convergence is independent of the dimensionality of the model space.

artificial intelligence, huber, machine learning, (17 more...)

arXiv.org Machine Learning

2006.06783

Country:

North America > United States > New York (0.04)
North America > United States > New Mexico > Lea County (0.04)
Asia > Middle East > Jordan (0.04)
(2 more...)

Genre: Research Report > New Finding (0.49)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.71)

Add feedback