Bayesian Learning via Stochastic Dynamics
Neural Information Processing Systems
The attempt to find a single "optimal" weight vector in conventional network training can lead to overfitting and poor generalization. Bayesian methods avoid this, without the need for a validation set, by averaging the outputs of many networks with weights sampled from the posterior distribution given the training data. Such a sample can be obtained by simulating a stochastic dynamical system that has the posterior as its stationary distribution.
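As a rough illustration of the idea of sampling weights by simulating a stochastic dynamical system, the sketch below uses discretized Langevin dynamics on a one-weight linear "network". Everything specific here is an assumption made for illustration and is not taken from the abstract: the choice of Langevin dynamics, the Gaussian prior and noise model, the toy data, the step size, and the helper names `grad_log_posterior`, `langevin_samples`, and `predict`. With a small step size, the stationary distribution of this simulation only approximates the posterior.

```python
import math
import random


def grad_log_posterior(w, xs, ys, noise_var=0.25, prior_var=1.0):
    """Gradient of log p(w | data) for y = w * x + Gaussian noise, with w ~ N(0, prior_var)."""
    grad_likelihood = sum((y - w * x) * x for x, y in zip(xs, ys)) / noise_var
    grad_prior = -w / prior_var
    return grad_likelihood + grad_prior


def langevin_samples(xs, ys, n_steps=20000, burn_in=2000, step=1e-3, seed=0):
    """Simulate discretized Langevin dynamics and return the weights visited after burn-in."""
    rng = random.Random(seed)
    w = 0.0
    samples = []
    for t in range(n_steps):
        # Drift toward high posterior density plus injected Gaussian noise.
        w += 0.5 * step * grad_log_posterior(w, xs, ys)
        w += math.sqrt(step) * rng.gauss(0.0, 1.0)
        if t >= burn_in:
            samples.append(w)
    return samples


def predict(samples, x):
    """Bayesian prediction: average the outputs of all sampled weight settings."""
    return sum(w * x for w in samples) / len(samples)


# Hypothetical toy data: true slope 1.5, noise standard deviation 0.5.
data_rng = random.Random(1)
xs = [i / 10.0 for i in range(-10, 11)]
ys = [1.5 * x + data_rng.gauss(0.0, 0.5) for x in xs]

samples = langevin_samples(xs, ys)
print("averaged prediction at x = 2.0:", predict(samples, 2.0))
```

The prediction is an average over many sampled weights rather than the output of one fitted weight, which is the contrast the abstract draws with conventional training. For a real multi-layer network the gradient would come from backpropagation, and a more elaborate dynamics might be preferable.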
Dec-31-1993