Convergence to minima for the continuous version of Backtracking Gradient Descent

Nov-13-2019–arXiv.org Machine Learning

The main result of this paper is: {\bf Theorem.} Let $f:\mathbb{R}^k\rightarrow \mathbb{R}$ be a $C^{1}$ function such that $\nabla f$ is locally Lipschitz continuous. Assume moreover that $f$ is $C^2$ near its generalised saddle points. Fix real numbers $\delta_0>0$ and $0<\alpha <1$. Then there is a smooth function $h:\mathbb{R}^k\rightarrow (0,\delta_0]$ so that the map $H:\mathbb{R}^k\rightarrow \mathbb{R}^k$ defined by $H(x)=x-h(x)\nabla f(x)$ has the following properties: (i) For all $x\in \mathbb{R}^k$, we have $f(H(x))-f(x)\leq -\alpha h(x)||\nabla f(x)||^2$. (ii) For every $x_0\in \mathbb{R}^k$, the sequence $x_{n+1}=H(x_n)$ either satisfies $\lim_{n\rightarrow\infty}||x_{n+1}-x_n||=0$ or $\lim_{n\rightarrow\infty}||x_n||=\infty$. Each cluster point of $\{x_n\}$ is a critical point of $f$. If moreover $f$ has at most countably many critical points, then $\{x_n\}$ either converges to a critical point of $f$ or $\lim_{n\rightarrow\infty}||x_n||=\infty$. (iii) There is a set $\mathcal{E}_1\subset \mathbb{R}^k$ of Lebesgue measure $0$ so that for all $x_0\in \mathbb{R}^k\backslash \mathcal{E}_1$, the sequence $x_{n+1}=H(x_n)$, {\bf if it converges}, cannot converge to a {\bf generalised} saddle point. (iv) There is a set $\mathcal{E}_2\subset \mathbb{R}^k$ of Lebesgue measure $0$ so that for all $x_0\in \mathbb{R}^k\backslash \mathcal{E}_2$, any cluster point of the sequence $x_{n+1}=H(x_n)$ is not a saddle point, and more generally cannot be an isolated generalised saddle point. Some other results are also proven.
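To make property (i) concrete, the following is a minimal sketch of the familiar discrete backtracking (Armijo) rule, which picks a step $h\in(0,\delta_0]$ satisfying the same inequality $f(x-h\nabla f(x))-f(x)\leq -\alpha h||\nabla f(x)||^2$. It is not the smooth step function $h(x)$ whose existence the theorem asserts; the step size found here depends on $x$ only piecewise. The names `backtracking_step` and `iterate`, the shrink factor `beta`, the cap `max_halvings`, and the test function are all assumptions made for illustration.

```python
import numpy as np

def backtracking_step(f, grad_f, x, delta0=1.0, alpha=0.5, beta=0.5, max_halvings=60):
    """One step x -> x - h * grad f(x), with h found by backtracking.

    Starts at h = delta0 and multiplies h by beta until the Armijo
    condition f(x - h g) - f(x) <= -alpha * h * ||g||^2 holds.
    beta and max_halvings are illustrative choices, not from the abstract.
    """
    g = grad_f(x)
    g2 = float(np.dot(g, g))
    h = delta0
    for _ in range(max_halvings):
        if f(x - h * g) - f(x) <= -alpha * h * g2:
            break
        h *= beta
    return x - h * g, h

def iterate(f, grad_f, x0, n_steps=200, **kwargs):
    """Run x_{n+1} = H(x_n) for a fixed number of steps."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_steps):
        x, _ = backtracking_step(f, grad_f, x, **kwargs)
    return x

# Hypothetical test function: minima at (1, 0) and (-1, 0), saddle at (0, 0).
def f(p):
    return (p[0] ** 2 - 1.0) ** 2 + p[1] ** 2

def grad_f(p):
    return np.array([4.0 * p[0] * (p[0] ** 2 - 1.0), 2.0 * p[1]])

x0 = np.array([0.3, 0.7])
x1, h0 = backtracking_step(f, grad_f, x0)
x_final = iterate(f, grad_f, x0)
```

Here `x1` and `h0` are the result of a single step from `x0`, and `x_final` is the iterate after 200 steps. In this example the starting point lies off the saddle's stable manifold (the line $x=0$), so the iterates move away from the saddle at the origin, in line with the behaviour described in (iii) and (iv).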

critical point, generalised saddle point, saddle point


