AITopics

2503.22244

Country:

Asia > China > Shanghai > Shanghai (0.04)
Asia > Middle East > Jordan (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report > New Finding (0.48)

Industry: Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.86)

Hennick, Max, De Baerdemacker, Stijn

Almost Bayesian: The Fractal Dynamics of Stochastic Gradient Descent

arXiv.org Artificial IntelligenceMar-28-2025

We show that the behavior of stochastic gradient descent is related to Bayesian statistics by showing that SGD is effectively diffusion on a fractal landscape, where the fractal dimension can be accounted for in a purely Bayesian way. By doing this we show that SGD can be regarded as a modified Bayesian sampler which accounts for accessibility constraints induced by the fractal structure of the loss landscape. We verify our results experimentally by examining the diffusion of weights during training. These results offer insight into the factors which determine the learning process, and seemingly answer the question of how SGD and purely Bayesian sampling are related.

artificial intelligence, dimension, machine learning, (16 more...)

2503.22478

Country:

North America > Canada > New Brunswick > Fredericton (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Africa > Middle East > Tunisia > Ben Arous Governorate > Ben Arous (0.04)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)

Ranaweera, Kanishka, Nguyen, Dinh C., Pathirana, Pubudu N., Smith, David, Ding, Ming, Rakotoarivelo, Thierry, Seneviratne, Aruna

Federated Learning with Differential Privacy: An Utility-Enhanced Approach

Abstract--Federated learning has emerged as an attractive approach to protect data privacy by eliminating the need for sharing clients' data while reducing communication costs compared with centralized machine learning algorithms. However, recent studies have shown that federated learning alone does not guarantee privacy, as private data may still be inferred from the uploaded parameters to the central server. In order to successfully avoid data leakage, adopting differential privacy (DP) in the local optimization process or in the local update aggregation process has emerged as two feasible ways for achieving sample-level or user-level privacy guarantees respectively, in federated learning models. However, compared to their non-private equivalents, these approaches suffer from a poor utility . T o improve the privacy-utility trade-off, we present a modification to these vanilla differentially private algorithms based on a Haar wavelet transformation step and a novel noise injection scheme that significantly lowers the asymptotic bound of the noise variance. We also present a holistic convergence analysis of our proposed algorithm, showing that our method yields better convergence performance than the vanilla DP algorithms. Numerical experiments on real-world datasets demonstrate that our method outperforms existing approaches in model utility while maintaining the same privacy guarantees. Machine learning (ML) has become an essential tool to analyze this data and extract valuable insights for various applications, including facial recognition, data analytics, weather prediction, and speech recognition, among others [1], [2], [3], [4], [5]. However, in real-world settings, data -- particularly personal data -- is often created and stored on end-user devices. The majority of traditional ML algorithms require the centralization of these training data, which involves collecting and processing data at a potent cloud-based server [6], [7]. This process carries significant risks to data integrity and privacy, particularly when it comes to personal data. Kanishka Ranaweera is with School of Engineering and Built Environment, Deakin University, Waurn Ponds, VIC 3216, Australia, and also with the Data61, CSIRO, Eveleigh, NSW 2015, Australia. Dinh C. Nguyen is with the Department of Electrical and Computer Engineering, The University of Alabama in Huntsville Alabama, USA. Pubudu N. Pathirana is with School of Engineering and Built Environment, Deakin University, Waurn Ponds, VIC 3216, Australia.

artificial intelligence, coefficient, machine learning, (19 more...)

2503.21154

Country:

North America > United States > Alabama > Madison County > Huntsville (0.24)
Oceania > Australia > New South Wales (0.04)

Genre: Research Report > New Finding (0.67)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.46)

AdvSGM: Differentially Private Graph Learning via Adversarial Skip-gram Model

Zhang, Sen, Ye, Qingqing, Hu, Haibo, Xu, Jianliang

--The skip-gram model (SGM), which employs a neural network to generate node vectors, serves as the basis for numerous popular graph embedding techniques. However, since the training datasets contain sensitive linkage information, the parameters of a released SGM may encode private information and pose significant privacy risks. Differential privacy (DP) is a rigorous standard for protecting individual privacy in data analysis. Nevertheless, when applying differential privacy to skip-gram in graphs, it becomes highly challenging due to the complex link relationships, which potentially result in high sensitivity and necessitate substantial noise injection. T o tackle this challenge, we present AdvSGM, a differentially private skip-gram for graphs via adversarial training. Our core idea is to leverage adversarial training to privatize skip-gram while improving its utility. T owards this end, we develop a novel adversarial training module by devising two optimizable noise terms that correspond to the parameters of a skip-gram. By fine-tuning the weights between modules within AdvSGM, we can achieve differentially private gradient updates without additional noise injection. Extensive experimental results on six real-world graph datasets show that AdvSGM preserves high data utility across different downstream tasks. Graph embedding, which has attracted increasing research attention, represents nodes by low-dimensional vectors while preserving the inherent properties and structures of the graph. In this way, well-studied machine learning algorithms can be easily applied for further mining tasks like clustering, classification, and prediction. Skip-gram models (SGMs) are a popular class of graph embedding models thanks to their simplicity and effectiveness, including DeepWalk [1], LINE [2], and node2vec [3]. However, SGMs, which capture not only general data characteristics but also specific details about individual data, are vulnerable to adversarial attacks, particularly user-linkage attacks [4] that exploit the linkage information between nodes to infer whether an individual is present in the training dataset. Therefore, node embeddings need to be sanitized with privacy guarantees before they can be released to the public. Differential privacy (DP) [5] is a well studied statistical privacy model recognized for its rigorous mathematical underpinnings. In this paper, we study the problem of achieving privacy-preserving skip-gram for graphs under differential privacy.

artificial intelligence, differential privacy, machine learning, (16 more...)

2503.21426

Country:

Asia > China > Hong Kong (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.48)

Erden, Zeki Doruk, Faltings, Boi

A Proposal for Networks Capable of Continual Learning

We analyze the ability of computational units to retain past responses after parameter updates, a key property for system-wide continual learning. Neural networks trained with gradient descent lack this capability, prompting us to propose Modelleyen, an alternative approach with inherent response preservation. We demonstrate through experiments on modeling the dynamics of a simple environment and on MNIST that, despite increased computational complexity and some representational limitations at its current stage, Modelleyen achieves continual learning without relying on sample replay or predefined task boundaries.

artificial intelligence, csv, machine learning, (18 more...)

2503.22068

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Switzerland > Vaud > Lausanne (0.04)
Africa > South Sudan > Equatoria > Central Equatoria > Juba (0.04)

Genre: Research Report (0.82)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.34)

Ranaweera, Kanishka, Smith, David, Pathirana, Pubudu N., Ding, Ming, Rakotoarivelo, Thierry, Seneviratne, Aruna

Multi-Objective Optimization for Privacy-Utility Balance in Differentially Private Federated Learning

--Federated learning (FL) enables collaborative model training across distributed clients without sharing raw data, making it a promising approach for privacy-preserving machine learning. However, ensuring differential privacy (DP) in FL presents challenges due to the trade-off between model utility and privacy protection. Clipping gradients before aggregation is a common strategy to limit privacy loss, but selecting an optimal clipping norm is non-trivial, as excessively high values compromise privacy, while overly restrictive clipping degrades model performance. In this work, we propose an adaptive clipping mechanism that dynamically adjusts the clipping norm using a multi-objective optimization framework. We theoretically analyze the convergence properties of our method and demonstrate its effectiveness through extensive experiments on MNIST, Fashion-MNIST, and CIF AR-10 datasets. Our results show that adaptive clipping consistently outperforms fixed-clipping baselines, achieving improved accuracy under the same privacy constraints. This work highlights the potential of dynamic clipping strategies to enhance privacy-utility trade-offs in differentially private federated learning. Federated Learning (FL) has emerged as a transformative paradigm for collaborative training of machine learning models without centralized data aggregation [1], [2]. Kanishka Ranaweera is with School of Engineering and Built Environment, Deakin University, Waurn Ponds, VIC 3216, Australia, and also with the Data61, CSIRO, Eveleigh, NSW 2015, Australia. David Smith is with Data61, CSIRO, Eveleigh, NSW 2015, Australia.

artificial intelligence, machine learning, optimization problem, (17 more...)

2503.21159

Country: Oceania > Australia > New South Wales (0.04)

Genre:

Research Report > New Finding (0.68)
Research Report > Promising Solution (0.66)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Lewis, Laura, Gilboa, Dar, McClean, Jarrod R.

Quantum advantage for learning shallow neural networks with natural data distributions

arXiv.org Artificial IntelligenceMar-26-2025

The application of quantum computers to machine learning tasks is an exciting potential direction to explore in search of quantum advantage. In the absence of large quantum computers to empirically evaluate performance, theoretical frameworks such as the quantum probably approximately correct (PAC) and quantum statistical query (QSQ) models have been proposed to study quantum algorithms for learning classical functions. Despite numerous works investigating quantum advantage in these models, we nevertheless only understand it at two extremes: either exponential quantum advantages for uniform input distributions or no advantage for potentially adversarial distributions. In this work, we study the gap between these two regimes by designing an efficient quantum algorithm for learning periodic neurons in the QSQ model over a broad range of non-uniform distributions, which includes Gaussian, generalized Gaussian, and logistic distributions. To our knowledge, our work is also the first result in quantum learning theory for classical functions that explicitly considers real-valued functions. Recent advances in classical learning theory prove that learning periodic neurons is hard for any classical gradient-based algorithm, giving us an exponential quantum advantage over such algorithms, which are the standard workhorses of machine learning. Moreover, in some parameter regimes, the problem remains hard for classical statistical query algorithms and even general classical algorithms learning under small amounts of noise.

artificial intelligence, equation, machine learning, (18 more...)

2503.20879

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.27)
North America > United States (0.27)
North America > Canada > Quebec > Capitale-Nationale Region > Québec (0.04)
(3 more...)

Genre: Research Report > New Finding (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.46)

arXiv.org Machine LearningMar-25-2025

Stop Walking in Circles! Bailing Out Early in Projected Gradient Descent

Doldo, Philip, Everett, Derek, Khanna, Amol, Nguyen, Andre T, Raff, Edward

Projected Gradient Descent (PGD) under the $L_\infty$ ball has become one of the defacto methods used in adversarial robustness evaluation for computer vision (CV) due to its reliability and efficacy, making a strong and easy-to-implement iterative baseline. However, PGD is computationally demanding to apply, especially when using thousands of iterations is the current best-practice recommendation to generate an adversarial example for a single image. In this work, we introduce a simple novel method for early termination of PGD based on cycle detection by exploiting the geometry of how PGD is implemented in practice and show that it can produce large speedup factors while providing the \emph{exact} same estimate of model robustness as standard PGD. This method substantially speeds up PGD without sacrificing any attack strength, enabling evaluations of robustness that were previously computationally intractable.

artificial intelligence, iteration, machine learning, (18 more...)

arXiv.org Machine Learning

2503.19347

Country: North America > United States > California > Los Angeles County > Long Beach (0.04)

Genre: Research Report (0.70)

Industry: Information Technology > Security & Privacy (0.95)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.61)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

arXiv.org Machine LearningMar-25-2025

Interpretable Deep Regression Models with Interval-Censored Failure Time Data

Yuan, Changhui, Zhao, Shishun, Li, Shuwei, Song, Xinyuan, Chen, Zhao

Deep neural networks (DNNs) have become powerful tools for modeling complex data structures through sequentially integrating simple functions in each hidden layer. In survival analysis, recent advances of DNNs primarily focus on enhancing model capabilities, especially in exploring nonlinear covariate effects under right censoring. However, deep learning methods for interval-censored data, where the unobservable failure time is only known to lie in an interval, remain underexplored and limited to specific data type or model. This work proposes a general regression framework for interval-censored data with a broad class of partially linear transformation models, where key covariate effects are modeled parametrically while nonlinear effects of nuisance multi-modal covariates are approximated via DNNs, balancing interpretability and flexibility. We employ sieve maximum likelihood estimation by leveraging monotone splines to approximate the cumulative baseline hazard function. To ensure reliable and tractable estimation, we develop an EM algorithm incorporating stochastic gradient descent. We establish the asymptotic properties of parameter estimators and show that the DNN estimator achieves minimax-optimal convergence. Extensive simulations demonstrate superior estimation and prediction accuracy over state-of-the-art methods. Applying our method to the Alzheimer's Disease Neuroimaging Initiative dataset yields novel insights and improved predictive performance compared to traditional approaches.

artificial intelligence, interpretable deep regression model, machine learning, (3 more...)

arXiv.org Machine Learning

2503.19763

Genre: Research Report (0.69)

Industry:

Law > Civil Rights & Constitutional Law (1.00)
Health & Medicine > Therapeutic Area > Neurology > Alzheimer's Disease (0.53)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.53)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.53)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.53)

arXiv.org Artificial IntelligenceMar-23-2025

Improved Rates of Differentially Private Nonconvex-Strongly-Concave Minimax Optimization

Zhang, Ruijia, Lei, Mingxi, Ding, Meng, Xiang, Zihang, Xu, Jinhui, Wang, Di

In this paper, we study the problem of (finite sum) minimax optimization in the Differential Privacy (DP) model. Unlike most of the previous studies on the (strongly) convex-concave settings or loss functions satisfying the Polyak-Lojasiewicz condition, here we mainly focus on the nonconvex-strongly-concave one, which encapsulates many models in deep learning such as deep AUC maximization. Specifically, we first analyze a DP version of Stochastic Gradient Descent Ascent (SGDA) and show that it is possible to get a DP estimator whose $l_2$-norm of the gradient for the empirical risk function is upper bounded by $\tilde{O}(\frac{d^{1/4}}{({n\epsilon})^{1/2}})$, where $d$ is the model dimension and $n$ is the sample size. We then propose a new method with less gradient noise variance and improve the upper bound to $\tilde{O}(\frac{d^{1/3}}{(n\epsilon)^{2/3}})$, which matches the best-known result for DP Empirical Risk Minimization with non-convex loss. We also discussed several lower bounds of private minimax optimization. Finally, experiments on AUC maximization, generative adversarial networks, and temporal difference learning with real-world data support our theoretical analysis.

artificial intelligence, machine learning, reinforcement learning, (2 more...)

2503.18317

Genre: Research Report (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.80)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.53)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.53)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.53)