Analyzing the Generalization Capability of SGLD Using Properties of Gaussian Channels

Jan-19-2025, 05:01:56 GMT–Neural Information Processing Systems

Optimization is a key component of training machine learning models and strongly affects their generalization. In this paper, we consider a particular optimization method, the stochastic gradient Langevin dynamics (SGLD) algorithm, and investigate the generalization of models trained by SGLD. We derive a new generalization bound by connecting SGLD with Gaussian channels found in information and communication theory. Our bound can be computed from the training data and incorporates the variance of gradients to quantify a particular kind of "sharpness" of the loss landscape. We also consider an algorithm closely related to SGLD, namely differentially private SGD (DP-SGD).
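For readers unfamiliar with SGLD, the following is a minimal sketch of the generic update (a mini-batch gradient step followed by added Gaussian noise) together with an empirical variance of per-example gradients. It does not reproduce the paper's bound. The toy least-squares model, the function names, and the values of `step_size` and `noise_std` are all assumptions chosen for illustration.

```python
import numpy as np

def sgld_step(theta, grad, step_size, noise_std, rng):
    """One generic SGLD update: gradient step plus isotropic Gaussian noise."""
    noise = rng.normal(0.0, noise_std, size=theta.shape)
    return theta - step_size * grad + noise

def per_example_grads(theta, X, y):
    """Per-example gradients of the toy loss 0.5 * (x @ theta - y)**2."""
    residuals = X @ theta - y          # shape (n,)
    return residuals[:, None] * X      # shape (n, d)

def gradient_variance(grads):
    """Mean squared distance of per-example gradients from their average."""
    mean_grad = grads.mean(axis=0)
    return float(np.mean(np.sum((grads - mean_grad) ** 2, axis=1)))

# Hypothetical toy data: noiseless linear regression in three dimensions.
rng = np.random.default_rng(0)
X = rng.normal(size=(64, 3))
true_theta = np.array([1.0, -2.0, 0.5])
y = X @ true_theta

theta = np.zeros(3)
for _ in range(200):
    batch = rng.choice(len(X), size=16, replace=False)
    grads = per_example_grads(theta, X[batch], y[batch])
    theta = sgld_step(theta, grads.mean(axis=0),
                      step_size=0.05, noise_std=0.01, rng=rng)

# Gradient variance over the full training set at the final iterate.
final_variance = gradient_variance(per_example_grads(theta, X, y))
```

The injected noise is what makes each update resemble a Gaussian channel: the next iterate is the noiseless update observed through additive Gaussian noise. The `gradient_variance` helper only illustrates the kind of training-data quantity the abstract mentions, not the exact term in the bound.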

gaussian channel, generalization capability, sgld, (5 more...)



Technology:
- Information Technology > Artificial Intelligence > Machine Learning (1.00)