Kane
Learning
For additional motivation, it is reasonable to consider Massart noise a more realistic model of real-life noise (even when benign) than the RCN model, as it allows for some amount of non-uniformity. This made Definition 1 a possibly tractable way to relax the noise assumption without running into the aforementioned computational barriers for agnostic learning.
Underdamped Langevin MCMC with third order convergence
Scott, Maximilian, O'Kane, Dáire, Jelinčič, Andraž, Foster, James
In this paper, we propose a new numerical method for the underdamped Langevin diffusion (ULD) and present a non-asymptotic analysis of its sampling error in the 2-Wasserstein distance when the $d$-dimensional target distribution $p(x)\propto e^{-f(x)}$ is strongly log-concave and has varying degrees of smoothness. Precisely, under the assumptions that the gradient and Hessian of $f$ are Lipschitz continuous, our algorithm achieves a 2-Wasserstein error of $\varepsilon$ in $\mathcal{O}(\sqrt{d}/\varepsilon)$ and $\mathcal{O}(\sqrt{d}/\sqrt{\varepsilon})$ steps, respectively. Therefore, our algorithm has complexity similar to that of other popular Langevin MCMC algorithms under matching assumptions. However, if we additionally assume that the third derivative of $f$ is Lipschitz continuous, then our algorithm achieves a 2-Wasserstein error of $\varepsilon$ in $\mathcal{O}(\sqrt{d}/\varepsilon^{\frac{1}{3}})$ steps. To the best of our knowledge, this is the first gradient-only method for ULD with third order convergence. To support our theory, we perform Bayesian logistic regression across a range of real-world datasets, where our algorithm achieves competitive performance compared to an existing underdamped Langevin MCMC algorithm and the popular No U-Turn Sampler (NUTS).
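For intuition about what such samplers compute, here is a minimal sketch of a generic underdamped Langevin MCMC step. This is a plain first-order Euler-Maruyama discretization of the ULD, not the third-order scheme of the paper; the target, friction parameter `gamma`, and step size are illustrative choices.

```python
import numpy as np

def uld_euler_maruyama(grad_f, x0, v0, step, n_steps, gamma=2.0, rng=None):
    """Euler-Maruyama discretization of the underdamped Langevin diffusion

        dX_t = V_t dt
        dV_t = -gamma * V_t dt - grad_f(X_t) dt + sqrt(2 * gamma) dW_t,

    whose invariant measure has X-marginal proportional to exp(-f(x)).
    This is a generic first-order sampler for illustration only.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.array(x0, dtype=float)
    v = np.array(v0, dtype=float)
    samples = np.empty((n_steps, x.size))
    for i in range(n_steps):
        noise = rng.standard_normal(x.size)
        # Velocity update: friction, gradient drift, and Brownian increment.
        v = v - step * (gamma * v + grad_f(x)) + np.sqrt(2 * gamma * step) * noise
        # Position update with the refreshed velocity.
        x = x + step * v
        samples[i] = x
    return samples

# Target: standard Gaussian, f(x) = ||x||^2 / 2, so grad_f(x) = x.
samples = uld_euler_maruyama(lambda x: x, x0=[0.0], v0=[0.0],
                             step=0.05, n_steps=20000,
                             rng=np.random.default_rng(0))
burned = samples[5000:]
print(burned.mean(), burned.var())  # both should be close to 0 and 1
```

After discarding burn-in, the empirical mean and variance should approach those of the standard Gaussian target, up to the O(step) discretization bias that higher-order schemes such as the one analyzed in the paper are designed to shrink.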