Many thanks to the reviewers for their deep, thoughtful reviews and constructive suggestions
–Neural Information Processing Systems
We note that despite very recent observations on empirical superiority of adaptive synchronization (e.g., Surely, it would be interesting to see if our bound can be tightened. R1. log T communication rounds clarification: However, for local SGD with periodic averaging the proof techniques are more involved. We do not tune the learning rate.
Neural Information Processing Systems
Aug-20-2025, 01:13:15 GMT