Second-order Conditional Gradients
Alejandro Carderera, Sebastian Pokutta
An immensely powerful approach when X = ℝ^n is to construct a second-order approximation to f(x) at the current iterate using first- and second-order information, denoted by f̂(x), and to move in the direction that minimizes this approximation, giving rise to the family of methods known as Newton methods (Kantorovich, 1948). A damped variant of this method, applied to the minimization of a self-concordant function, converges globally and exhibits quadratic local convergence once the iterates are close enough to the optimum (Nesterov & Nemirovskii, 1994). The global convergence of this method also extends to strongly convex and smooth functions (Nesterov & Nemirovskii, 1994; Nesterov, 2013). A cubic-regularized version of Newton's method extends global convergence further, to a broader class of functions than the self-concordant or the strongly convex and smooth ones (Nesterov & Polyak, 2006). When X ⊂ ℝ^n is a convex set, one can use a constrained analog of these methods (Levitin & Polyak, 1966), in which a quadratic approximation to the function is minimized over X at each iteration.
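To make the damped Newton update concrete, here is a minimal NumPy sketch using the classical step size 1/(1 + λ(x_k)), where λ(x_k) = (∇f(x_k)ᵀ ∇²f(x_k)⁻¹ ∇f(x_k))^{1/2} is the Newton decrement. The example objective (a quadratic plus a log-barrier, which is self-concordant on the positive orthant), the function names, and the tolerance are illustrative assumptions, not details from the paper.

```python
import numpy as np

def damped_newton(grad, hess, x0, tol=1e-8, max_iter=100):
    """Damped Newton method with step size 1 / (1 + lambda(x)),
    where lambda(x) is the Newton decrement (Nesterov & Nemirovskii, 1994)."""
    x = x0.astype(float)
    for _ in range(max_iter):
        g, H = grad(x), hess(x)
        direction = np.linalg.solve(H, g)   # Newton direction H^{-1} g
        lam = np.sqrt(g @ direction)        # Newton decrement lambda(x)
        if lam < tol:
            break
        x = x - direction / (1.0 + lam)     # damped Newton step
    return x

# Illustrative self-concordant objective on the positive orthant:
# f(x) = 0.5 x^T Q x - sum_i log(x_i), with Q positive definite.
n = 5
rng = np.random.default_rng(0)
A = rng.standard_normal((n, n))
Q = A @ A.T + n * np.eye(n)                 # positive definite by construction
grad = lambda x: Q @ x - 1.0 / x
hess = lambda x: Q + np.diag(1.0 / x**2)
x_star = damped_newton(grad, hess, np.ones(n))
```

Because the damped step has length λ/(1 + λ) < 1 in the local Hessian norm, each iterate stays inside the Dikin ellipsoid and hence in the domain of the self-concordant objective, so no explicit positivity safeguard is needed in this sketch.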
Feb-20-2020