Interpreting a Penalty as the Influence of a Bayesian Prior

Wolinski, Pierre, Charpiat, Guillaume, Ollivier, Yann

Feb-1-2020–arXiv.org Machine Learning

Inria, Team TAU, Gif-sur-Yvette, France; Facebook, France. arXiv:2002.00178v1

For instance, penalties are used to improve generalization, prune neurons or reduce the rank of tensors of weights. Therefore, usual penalties are mostly empirical and user-defined, and are integrated into the loss as follows: $L(w) = \ell(w) + r(w)$, with $w$ the vector of all parameters in the network, $\ell(w)$ the error term and $r(w)$ the penalty term. From a Bayesian point of view, optimizing such a loss $L$ is equivalent to finding the Maximum A Posteriori (MAP) of the parameters $w$ given the training data and a prior $\alpha \propto \exp(-r)$. Indeed, assuming that the loss $\ell$ is a log-likelihood loss, namely $\ell(w) = -\ln p_w(D)$ with dataset $D$, minimizing $L$ is equivalent to minimizing $L_{\mathrm{MAP}}(w) = -\ln p_w(D) - \ln(\alpha(w))$. Thus, within the MAP framework, we can interpret the penalty term $r$ as the influence of a prior $\alpha$ [14].

However, the MAP approximates the Bayesian posterior very roughly, by taking its maximum. Variational Inference (VI) provides a variational posterior distribution rather than a single value, hopefully representing the Bayesian posterior much better. VI looks for the best posterior approximation within a family $\beta_u(w)$ of approximate posteriors over $w$, parameterized by $u$.
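The equivalence between the penalized loss L and the MAP objective can be checked numerically on a toy problem. The sketch below is only an illustration and is not taken from the paper: it assumes a one-parameter model with observations y ~ N(w, 1), an L2 penalty r(w) = (lam / 2) w^2, and the normalized prior alpha = exp(-r) / Z, which is then the Gaussian N(0, 1/lam). The function names, the data values and the grid search are all hypothetical choices made for the example.

```python
import math

def neg_log_likelihood(w, data):
    """Error term l(w) = -ln p_w(D) for i.i.d. observations y ~ N(w, 1) (assumed model)."""
    n = len(data)
    return 0.5 * sum((y - w) ** 2 for y in data) + 0.5 * n * math.log(2 * math.pi)

def penalty(w, lam):
    """Penalty term r(w) = (lam / 2) * w^2 (illustrative L2 penalty)."""
    return 0.5 * lam * w ** 2

def log_prior(w, lam):
    """ln alpha(w) for the normalized prior alpha = exp(-r) / Z, here N(0, 1/lam)."""
    log_z = 0.5 * math.log(2 * math.pi / lam)  # log of the normalizing constant Z
    return -penalty(w, lam) - log_z

def penalized_loss(w, data, lam):
    """L(w) = l(w) + r(w)."""
    return neg_log_likelihood(w, data) + penalty(w, lam)

def map_loss(w, data, lam):
    """L_MAP(w) = -ln p_w(D) - ln alpha(w)."""
    return neg_log_likelihood(w, data) - log_prior(w, lam)

# Made-up data and penalty strength, for illustration only.
data = [0.8, 1.3, 0.4, 1.9, 1.1]
lam = 2.0

# Brute-force minimization over a grid, to avoid assuming any optimizer.
grid = [i / 1000 for i in range(-3000, 3001)]
w_pen = min(grid, key=lambda w: penalized_loss(w, data, lam))
w_map = min(grid, key=lambda w: map_loss(w, data, lam))

# Closed-form minimizer of this particular quadratic objective.
w_closed = sum(data) / (len(data) + lam)

# L_MAP - L should be the same constant (ln Z) at every w.
offsets = [map_loss(w, data, lam) - penalized_loss(w, data, lam)
           for w in (-1.0, 0.0, 2.5)]

print("argmin of L:    ", w_pen)
print("argmin of L_MAP:", w_map)
print("closed form:    ", w_closed)
print("L_MAP - L:      ", offsets)
```

In this setting the two objectives differ only by the constant ln Z, which does not depend on w, so they share the same minimizer. This is the sense in which the penalty r acts as the prior alpha within the MAP framework.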

neural network, neuron, penalty

Country:
- North America > United States
  - Arizona > Maricopa County > Phoenix (0.04)
- Europe
  - France (0.44)
  - United Kingdom > England
    - Cambridgeshire > Cambridge (0.04)
- Asia > Middle East
  - Jordan (0.04)

Genre:
- Research Report (0.50)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning > Uncertainty
    - Bayesian Inference (0.68)
  - Machine Learning
    - Neural Networks (1.00)
    - Learning Graphical Models > Directed Networks
      - Bayesian Learning (0.68)
