A PAC-Bayesian Analysis of Randomized Learning with Application to Stochastic Gradient Descent
Neural Information Processing Systems
We study the generalization error of randomized learning algorithms, focusing on stochastic gradient descent (SGD), using a novel combination of PAC-Bayesian analysis and algorithmic stability. Importantly, our generalization bounds hold for all posterior distributions on an algorithm's random hyperparameters, including distributions that depend on the training data.
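As a minimal illustration of the "randomized learning" setting the abstract describes, the sketch below runs SGD where each step's learning rate (a hyperparameter) is drawn from a simple distribution. The function name, the uniform sampling distribution, and the toy quadratic objective are all illustrative assumptions, not the paper's construction; the paper's bounds concern distributions over such hyperparameters.

```python
import random

def sgd_random_hyperparams(grad, w0, n_steps=100, lr_mean=0.1, lr_spread=0.05, seed=0):
    """Run SGD on a scalar parameter, drawing each step's learning rate
    at random -- a toy instance of a randomized learning algorithm.
    (Hypothetical example; not the paper's algorithm.)"""
    rng = random.Random(seed)
    w = w0
    for _ in range(n_steps):
        # Sample this step's learning rate from a uniform "posterior"
        # over hyperparameters (an illustrative choice).
        lr = rng.uniform(lr_mean - lr_spread, lr_mean + lr_spread)
        w -= lr * grad(w)
    return w

# Minimize f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
w_final = sgd_random_hyperparams(lambda w: 2.0 * (w - 3.0), w0=0.0)
```

Because every sampled step size stays in a contractive range for this quadratic, the iterate converges to the minimizer at w = 3 despite the randomness; the paper's bounds quantify generalization over such distributions of hyperparameters rather than a single fixed setting.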