Making SGD Parameter-Free

Yair Carmon, Oliver Hinder

arXiv.org Artificial Intelligence 

Stochastic convex optimization (SCO) is a cornerstone of both the theory and practice of machine learning. Consequently, there is intense interest in developing SCO algorithms that require little to no prior knowledge of the problem parameters, and hence little to no tuning [27, 23, 20, 2, 22, 39]. In this work we consider the fundamental problem of non-smooth SCO (in a potentially unbounded domain) and seek methods that are adaptive to a key problem parameter: the initial distance to optimality. Current approaches for tackling this problem focus on the more general online learning problem of parameter-free regret minimization [8, 10, 11, 12, 21, 24, 25, 30, 32, 37], where the goal is to obtain regret guarantees that are valid for comparators with arbitrary norms. Research on parameter-free regret minimization has led to practical algorithms for stochastic optimization [9, 27, 32], methods that adapt to many problem parameters simultaneously [37], and methods that work with any norm [12].
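For concreteness, recall the tuning issue this line of work addresses: for G-Lipschitz non-smooth SCO, SGD with step size η = D/(G√T) attains the optimal DG/√T rate, where D = ||x₀ − x⋆|| is the initial distance to optimality, a quantity that is typically unknown in advance. The following minimal Python sketch (a toy illustration of the tuning problem, not the algorithm developed in this paper; the objective, horizon, and guesses for D are all illustrative assumptions) shows how sensitive plain SGD is to a wrong guess of D.

```python
# Toy illustration (not this paper's method): plain SGD on a 1-D non-smooth
# problem, with the classical step size eta = D / (G * sqrt(T)), which
# requires knowing the initial distance to optimality D = |x0 - x*|.
import numpy as np

def sgd_average(grad_sample, x0, eta, T, rng):
    """Fixed-step SGD; returns the averaged iterate."""
    x = float(x0)
    total = 0.0
    for _ in range(T):
        x -= eta * grad_sample(x, rng)
        total += x
    return total / T

# Toy objective f(x) = E|x - z| with z ~ N(1, 1); minimizer x* = 1, G = 1.
def grad_sample(x, rng):
    z = rng.normal(loc=1.0, scale=1.0)
    return np.sign(x - z)  # stochastic subgradient of |x - z|

rng = np.random.default_rng(0)
T, G, x0 = 1_000, 1.0, 100.0  # true D = 99, treated as unknown

for D_guess in (1.0, 100.0, 10_000.0):  # under- or over-guessing D hurts
    eta = D_guess / (G * np.sqrt(T))
    xbar = sgd_average(grad_sample, x0, eta, T, rng)
    print(f"D_guess={D_guess:>8.1f}  eta={eta:8.4f}  avg iterate={xbar:10.3f}")
```

Under-guessing D leaves the iterates far from the optimum (the steps are too small to cover the distance within T iterations), while over-guessing causes large oscillations around it; parameter-free methods aim to match the well-tuned rate without this guess.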
