adaptive learning rate
- North America > United States > Iowa > Story County > Ames (0.04)
- Europe > Russia (0.04)
- Asia > Russia (0.04)
- Asia > China > Shaanxi Province > Xi'an (0.04)
- Asia > China > Guangxi Province > Nanning (0.04)
Momentum-Based Variance Reduction in Non-Convex SGD
Ashok Cutkosky, Francesco Orabona
Variance reduction has emerged in recent years as a strong competitor to stochastic gradient descent in non-convex problems, providing the first algorithms to improve upon the converge rate of stochastic gradient descent for finding first-order critical points. However, variance reduction techniques typically require carefully tuned learning rates and willingness to use excessively large "mega-batches" in order to achieve their improved results.
- North America > Canada > Ontario > Toronto (0.14)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- North America > United States > California > Santa Clara County > Mountain View (0.04)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Scale-Invariant Fast Convergence in Games
Tsuchiya, Taira, Luo, Haipeng, Ito, Shinji
Scale-invariance in games has recently emerged as a widely valued desirable property. Yet, almost all fast convergence guarantees in learning in games require prior knowledge of the utility scale. To address this, we develop learning dynamics that achieve fast convergence while being both scale-free, requiring no prior information about utilities, and scale-invariant, remaining unchanged under positive rescaling of utilities. For two-player zero-sum games, we obtain scale-free and scale-invariant dynamics with external regret bounded by $\tilde{O}(A_{\mathrm{diff}})$, where $A_{\mathrm{diff}}$ is the payoff range, which implies an $\tilde{O}(A_{\mathrm{diff}} / T)$ convergence rate to Nash equilibrium after $T$ rounds. For multiplayer general-sum games with $n$ players and $m$ actions, we obtain scale-free and scale-invariant dynamics with swap regret bounded by $O(U_{\mathrm{max}} \log T)$, where $U_{\mathrm{max}}$ is the range of the utilities, ignoring the dependence on the number of players and actions. This yields an $O(U_{\mathrm{max}} \log T / T)$ convergence rate to correlated equilibrium. Our learning dynamics are based on optimistic follow-the-regularized-leader with an adaptive learning rate that incorporates the squared path length of the opponents' gradient vectors, together with a new stopping-time analysis that exploits negative terms in regret bounds without scale-dependent tuning. For general-sum games, scale-free learning is enabled also by a technique called doubling clipping, which clips observed gradients based on past observations.
- North America > United States > California (0.14)
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States (0.14)
- North America > Canada (0.04)
- North America > United States (0.14)
- Europe > France > Auvergne-Rhône-Alpes > Isère > Grenoble (0.04)
- Asia > Middle East > Jordan (0.04)
- (2 more...)