Everything that Works Works Because it's Bayesian: Why Deep Nets Generalize?


So far, we could not claim that deep networks trained with stochastic gradient descent are Bayesian. One reason may be that SGD biases learning towards flat minima rather than sharp minima. It turns out that Hochreiter and Schmidhuber (1997) motivated their work on seeking flat minima from a Bayesian, minimum description length (MDL) perspective: at a flat minimum, the weights can be specified with low precision without increasing the loss, so the network admits a shorter description.
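To make the MDL intuition concrete, here is a minimal sketch (a toy 1-D loss surface of my own construction, not from the paper) comparing a sharp and a flat minimum. Both minima reach the same loss, but under the same Gaussian perturbation of the weight, the flat minimum's loss barely moves, so its weight can be stored at lower precision, i.e. with fewer bits.

```python
import numpy as np

# Toy 1-D loss surface with two global minima of equal depth (illustrative
# construction, not from the original post): a sharp well near w = -2 and
# a flat well near w = +2.
def loss(w):
    sharp = 1.0 - np.exp(-50.0 * (w + 2.0) ** 2)  # narrow well
    flat = 1.0 - np.exp(-0.5 * (w - 2.0) ** 2)    # wide well
    return sharp * flat

rng = np.random.default_rng(0)
sigma = 0.1  # std of the weight perturbation, a stand-in for storage precision

for name, w_star in [("sharp", -2.0), ("flat", 2.0)]:
    noise = rng.normal(0.0, sigma, size=10_000)
    expected_loss = loss(w_star + noise).mean()
    print(f"{name} minimum: loss at w* = {loss(w_star):.4f}, "
          f"expected loss under perturbation = {expected_loss:.4f}")
```

Running this, the sharp minimum's expected loss under perturbation is orders of magnitude larger than the flat minimum's, which is the MDL argument in miniature: a solution that tolerates imprecise weights needs a shorter code to describe.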
