Achieving Near-Optimal Convergence for Distributed Minimax Optimization with Adaptive Stepsizes

Neural Information Processing Systems 

Sharma et al. (2022) provide […]. Yang et al. (2022a) integrate Local SGDA with stochastic gradient estimators to eliminate the […]. More recently, Zhang et al. (2023) adopt compressed momentum methods with Local SGD to improve the communication efficiency of the algorithm. For centralized nonconvex minimax problems, Yang et al. (2022b) show that, even in deterministic settings, GDA-based methods require timescale separation between the stepsizes for the primal and dual updates.
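The timescale separation discussed by Yang et al. (2022b) refers to running the primal (descent) variable with a much smaller stepsize than the dual (ascent) variable. Below is a minimal sketch of a two-timescale GDA update on a toy quadratic saddle-point problem; the objective, stepsizes, and iteration count are illustrative assumptions, not the paper's algorithm or constants.

```python
# Two-timescale gradient descent ascent (GDA) on the toy saddle-point problem
# f(x, y) = x^2/2 + x*y - y^2/2, whose unique saddle point is (0, 0).
# All choices below (objective, stepsizes, iteration count) are illustrative.

def grad_x(x, y):
    return x + y          # df/dx

def grad_y(x, y):
    return x - y          # df/dy

x, y = 2.0, -1.0
eta_x, eta_y = 0.01, 0.1  # primal stepsize much smaller than dual: timescale separation

for t in range(5000):
    gx, gy = grad_x(x, y), grad_y(x, y)
    x -= eta_x * gx       # descent step on the primal variable
    y += eta_y * gy       # ascent step on the dual variable

print(f"x = {x:.4f}, y = {y:.4f}")  # iterates approach the saddle point (0, 0)
```

With a single shared stepsize the coupled updates can oscillate or diverge on harder instances; keeping eta_x well below eta_y lets the dual variable track its best response before the primal variable moves, which is the behavior the timescale-separation requirement formalizes.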
