Higher Order Generalization Error for First Order Discretization of Langevin Diffusion

Feb-11-2021–arXiv.org Machine Learning

We propose a novel approach to analyzing the generalization error of discretizations of Langevin diffusion, such as stochastic gradient Langevin dynamics (SGLD). For an $\epsilon$ tolerance on the expected generalization error, it is known that a first order discretization can reach this target if we run $\Omega(\epsilon^{-1} \log (\epsilon^{-1}) )$ iterations with $\Omega(\epsilon^{-1})$ samples. In this article, we show that with additional smoothness assumptions, even first order methods can achieve arbitrarily low runtime complexity. More precisely, for each $N>0$, we provide a sufficient smoothness condition on the loss function such that a first order discretization can reach $\epsilon$ expected generalization error given $\Omega( \epsilon^{-1/N} \log (\epsilon^{-1}) )$ iterations with $\Omega(\epsilon^{-1})$ samples.
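For readers unfamiliar with SGLD, the following is a minimal sketch of the standard update, a first order (Euler-type) discretization of Langevin diffusion driven by minibatch gradients: $\theta_{k+1} = \theta_k - \eta \hat{g}(\theta_k) + \sqrt{2\eta/\beta}\,\xi_k$ with $\xi_k \sim N(0, I)$. It is not taken from the paper. The quadratic loss, the synthetic data, and the names `sgld`, `mean_loss_grad`, `step_size`, and `inv_temp` are all assumptions chosen for illustration.

```python
import numpy as np


def sgld(grad_fn, theta0, data, step_size, inv_temp, n_iters, batch_size, rng):
    """Illustrative SGLD loop (hypothetical helper, not the paper's code).

    Each iteration takes a minibatch gradient step and adds Gaussian noise
    scaled by sqrt(2 * step_size / inv_temp).
    """
    theta = np.array(theta0, dtype=float)
    n = len(data)
    for _ in range(n_iters):
        # Sample a minibatch without replacement.
        idx = rng.choice(n, size=batch_size, replace=False)
        grad = grad_fn(theta, data[idx])
        noise = rng.standard_normal(theta.shape)
        theta = theta - step_size * grad + np.sqrt(2.0 * step_size / inv_temp) * noise
    return theta


def mean_loss_grad(theta, batch):
    # Gradient of the batch average of 0.5 * ||theta - x||^2 (assumed toy loss).
    return theta - batch.mean(axis=0)


rng = np.random.default_rng(0)
data = rng.normal(loc=1.0, scale=1.0, size=(200, 2))  # synthetic samples
theta_final = sgld(
    mean_loss_grad,
    theta0=np.zeros(2),
    data=data,
    step_size=0.05,
    inv_temp=1000.0,
    n_iters=500,
    batch_size=20,
    rng=rng,
)
```

With this toy loss the iterates should settle near the empirical mean of `data`, fluctuating at a scale set by the step size, the minibatch noise, and the inverse temperature.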



Country:
- Europe > Netherlands (0.14)
- North America
  - Canada > Ontario
    - Toronto (0.14)
  - United States (0.14)

Genre:
- Research Report (0.83)

Technology:
- Information Technology
  - Artificial Intelligence
    - Machine Learning > Statistical Learning
      - Gradient Descent (0.34)
    - Representation & Reasoning > Mathematical & Statistical Methods (0.48)
  - Mathematics of Computing (0.93)
