Interpolating Between Gradient Descent and Exponentiated Gradient Using Reparameterized Gradient Descent
Ehsan Amid, Manfred K. Warmuth
Continuous-time mirror descent (CMD) is the limit of the discrete-time mirror descent (MD) update as the step size goes to zero. In this paper, we focus on the geometry of the primal and dual CMD updates and introduce a general framework for reparameterizing one CMD update as another. Specifically, the reparameterized update also corresponds to a CMD, but on the composite loss w.r.t. the new variables, and the original variables are recovered via the reparameterization map. We employ these results to introduce a new family of reparameterizations that interpolates between the two most commonly used updates, namely continuous-time gradient descent (GD) and unnormalized exponentiated gradient (EGU), while also extending to many other well-known updates. In particular, we show that for the underdetermined linear regression problem, these updates generalize the known behavior of GD and EGU, and provably converge to the minimum $\mathrm{L}_{2-\tau}$-norm solution for $\tau\in[0,1]$. Our new results also have implications for the regularized training of neural networks to induce sparsity.
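The EGU endpoint of this interpolation can be illustrated with a known identity that the abstract's framework generalizes: plain GD on a reparameterized variable $u$ with $w = u^2/4$ simulates the continuous-time EGU flow $\dot{w} = -w \odot \nabla L(w)$, since the chain rule gives $\dot{w} = (u/2)\,\dot{u} = -(u^2/4)\,\nabla L(w) = -w \odot \nabla L(w)$. The sketch below checks this numerically for a single small Euler step; the dimensions, step size, and stand-in gradient are illustrative choices, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 10
w = rng.uniform(0.5, 1.5, size=d)   # positive weights, as EGU requires
g = rng.normal(size=d)              # stand-in gradient dL/dw at w
lr = 1e-4

# One Euler step of the continuous-time EGU flow  dw/dt = -w * dL/dw:
w_egu = w - lr * w * g

# The same step obtained by plain GD on u, where w = u**2 / 4.
# Chain rule: dL/du = (u/2) * dL/dw, so a GD step on u is
#   u <- u - lr * (u/2) * g.
u = 2 * np.sqrt(w)
u_new = u - lr * (u / 2) * g
w_reparam = u_new**2 / 4

# The two trajectories agree up to an O(lr^2) discretization error,
# so in the continuous-time limit GD on u reproduces EGU on w.
max_gap = np.max(np.abs(w_egu - w_reparam))
```

The discrepancy is exactly $\mathrm{lr}^2\, w\, g^2/4$ per coordinate, so it vanishes quadratically as the step size shrinks, matching the continuous-time correspondence the abstract describes.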
Feb-24-2020