Hamiltonian Monte Carlo explained by Alex Rogozhnikov
MCMC (Markov chain Monte Carlo) is a family of methods applied in computational physics and chemistry and widely used in Bayesian machine learning. These methods are used to simulate physical systems that follow the Gibbs canonical distribution: $$ p(\vx) \propto \exp\left( - \frac{U(\vx)}{T} \right) $$ The probability $ p(\vx) $ that the system is in state $ \vx $ depends on the energy of that state, $ U(\vx) $, and on the temperature $ T $. This distribution describes, for instance, the positions and velocities of particles in a gas. In Bayesian machine learning, it defines the distribution of model parameters (such as the weights of a neural network). For example, consider a multivariate normal distribution: $$ p(\vx) \propto \exp\left( - \dfrac{1}{2} (\vx - \mu)^T \Sigma^{-1} (\vx - \mu) \right) $$ which corresponds to the following potential energy: $$ U(\vx) = \dfrac{1}{2} (\vx - \mu)^T \Sigma^{-1} (\vx - \mu), \qquad T = 1. $$ Any distribution with a strictly positive density can be rewritten as a Gibbs canonical distribution by taking $ U(\vx) = -\log p(\vx) $ and $ T = 1 $, but for many problems such energy-based distributions appear very naturally.
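The correspondence between the multivariate normal distribution and its potential energy can be sketched in a few lines of NumPy. This is only an illustration, not code from the original post: the function names `potential_energy` and `gibbs_weight` and the values of `mu` and `sigma` are assumptions chosen for the example.

```python
import numpy as np


def potential_energy(x, mu, sigma):
    """U(x) = 1/2 (x - mu)^T Sigma^{-1} (x - mu) for a multivariate normal."""
    diff = x - mu
    # Solve Sigma z = diff rather than forming the inverse explicitly.
    return 0.5 * diff @ np.linalg.solve(sigma, diff)


def gibbs_weight(x, mu, sigma, temperature=1.0):
    """Unnormalized Gibbs density exp(-U(x) / T)."""
    return np.exp(-potential_energy(x, mu, sigma) / temperature)


# Illustrative parameters (assumed, not taken from the text).
mu = np.array([1.0, -2.0])
sigma = np.array([[2.0, 0.3],
                  [0.3, 0.5]])

x = np.array([0.0, 0.0])
print("U(x)        =", potential_energy(x, mu, sigma))
print("weight, T=1 =", gibbs_weight(x, mu, sigma))
# A higher temperature flattens the distribution: the weight moves toward 1.
print("weight, T=2 =", gibbs_weight(x, mu, sigma, temperature=2.0))
```

The energy is zero at the mean and grows quadratically away from it, so the Gibbs weight is largest at `mu`. Only ratios of weights matter to MCMC, which is why the normalizing constant of the normal density never has to be computed.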
Mar-3-2018, 05:04:10 GMT