A Bayesian Perspective on Training Speed and Model Selection

Oct-10-2024, 13:46:04 GMT–Neural Information Processing Systems

We take a Bayesian perspective to illustrate a connection between training speed and the marginal likelihood in linear models. This provides two major insights. First, a measure of a model's training speed can be used to estimate its marginal likelihood. Second, under certain conditions, this measure predicts the relative weighting of models in linear model combinations trained to minimize a regression loss. We verify our results in model selection tasks for linear models and for the infinite-width limit of deep neural networks. We further provide encouraging empirical evidence that the intuition developed in these settings also holds for deep neural networks trained with stochastic gradient descent.
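
The first insight can be illustrated in the simplest setting, Bayesian linear regression. The following is a minimal sketch, not the authors' code. It assumes a zero-mean isotropic Gaussian prior on the weights and Gaussian observation noise; the function names and the `prior_var` and `noise_var` parameters are illustrative. It relies on the chain rule of probability, log p(y_1, ..., y_n) = sum_i log p(y_i | y_1, ..., y_{i-1}): summing the log posterior predictive likelihood of each point before the model is updated on it, a quantity that is large when the model adapts quickly to the data, recovers the log marginal likelihood.

```python
import numpy as np


def log_evidence_closed_form(X, y, prior_var=1.0, noise_var=0.1):
    """Log marginal likelihood of y under w ~ N(0, prior_var * I),
    y = X w + eps, eps ~ N(0, noise_var * I) (assumed model)."""
    n = len(y)
    # Marginally, y ~ N(0, prior_var * X X^T + noise_var * I).
    K = prior_var * X @ X.T + noise_var * np.eye(n)
    _, logdet = np.linalg.slogdet(K)
    quad = y @ np.linalg.solve(K, y)
    return -0.5 * (quad + logdet + n * np.log(2 * np.pi))


def log_evidence_sequential(X, y, prior_var=1.0, noise_var=0.1):
    """Same quantity, accumulated one data point at a time as the sum of
    log posterior predictive likelihoods (the 'training speed' view)."""
    d = X.shape[1]
    mean = np.zeros(d)            # current posterior mean of the weights
    cov = prior_var * np.eye(d)   # current posterior covariance
    total = 0.0
    for x_i, y_i in zip(X, y):
        # Predictive distribution of y_i given all previous points.
        pred_mean = x_i @ mean
        pred_var = x_i @ cov @ x_i + noise_var
        resid = y_i - pred_mean
        total += -0.5 * (np.log(2 * np.pi * pred_var) + resid**2 / pred_var)
        # Exact Bayesian (rank-one) update of the posterior with (x_i, y_i).
        gain = cov @ x_i / pred_var
        mean = mean + gain * resid
        cov = cov - np.outer(gain, x_i @ cov)
    return total
```

On any dataset the two functions should agree up to floating-point error. Evaluating either one for several candidate feature matrices gives a ranking of the corresponding models by evidence.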

bayesian perspective, neural network, training speed and model selection

Genre:
- Research Report > New Finding (0.52)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Neural Networks (1.00)
  - Statistical Learning > Gradient Descent (0.79)