DoWG Unleashed: An Efficient Universal Parameter-Free Gradient Descent Method
Ahmed Khaled, Konstantin Mishchenko, Chi Jin
We focus on gradient descent and its variants, as they are widely adopted and scale well when the model dimensionality d is large (Bottou et al., 2018). The optimization problem (OPT) arises in many applications, including solving linear systems, logistic regression, support vector machines, and other areas of machine learning (Boyd and Vandenberghe, 2004). Equally important, methods designed for (stochastic) convex optimization also shape the intuition for and the design of methods for nonconvex optimization; for example, momentum (Polyak, 1964), AdaGrad (Duchi et al., 2010), and Adam (Kingma and Ba, 2015) were all first analyzed in the convex optimization framework. As models become larger and more complex, the cost and environmental impact of training have grown rapidly as well (Sharir et al., 2020; Patterson et al., 2021). It is therefore vital to develop more efficient and effective methods for solving machine learning optimization tasks. One of the chief challenges in applying gradient-based methods is that they often require tuning one or more stepsize parameters (Goodfellow et al., 2016), and the choice of stepsize can significantly influence both a method's convergence speed and the quality of the solutions it obtains, especially in deep learning (Wilson et al., 2017). The cost and impact of hyperparameter tuning on the optimization process have led to significant research activity in recent years on designing parameter-free and adaptive optimization methods, see e.g.
Oct-29-2023
- Genre:
- Research Report (0.91)