Reviews: Fast and Furious Learning in Zero-Sum Games: Vanishing Regret with Non-Vanishing Step Sizes

Neural Information Processing Systems 

This paper studies gradient descent with a fixed step size for 2x2 zero-sum games. The key theoretical result is that while in discrete time, FTRL only guarantees linear regret for a fixed step size, by utilizing the geometry in gradient descent, one may obtain sublinear regret in 2x2 zero-sum games. The authors show that after a long enough time, strategies will lie on the boundary of the strategy space. The main result (Theorem 2) is obtained by characterizing the exact dynamics for gradient descent, suitably selecting a dual space, decomposing each "rotation" into distinct components, analyzing the change in energy in each iteration for each component and showing that time is quadratic in the number of partitions (rotations). The authors give an example which satisfies a matching lower bound, showing that their analysis is tight.