Unveiling the Power of Multiple Gossip Steps: A Stability-Based Generalization Analysis in Decentralized Training
–Neural Information Processing Systems
Decentralized training removes the centralized server, making it a communication-efficient approach that can significantly improve training efficiency, but it often suffers from degraded performance compared to centralized training. Multi-Gossip Steps (MGS) serve as a simple yet effective bridge between decentralized and centralized training, significantly reducing the empirical performance gap between the two. However, the theoretical reasons for its effectiveness, and whether this gap can be fully eliminated by MGS, remain open questions. In this paper, we derive upper bounds on the generalization error and excess error of MGS using stability analysis, systematically answering these two key questions.
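To make the idea of multiple gossip steps concrete, the following is a minimal sketch, not the paper's algorithm. It assumes four nodes on a ring, one scalar parameter per node, and a symmetric doubly stochastic mixing matrix `W`; the names `gossip_step`, `multi_gossip`, and `consensus_error` are hypothetical. It shows only the averaging phase (a real training loop would interleave local gradient updates) and illustrates that repeating the gossip step pulls the nodes closer to their common average, which is what a central server would compute exactly.

```python
def gossip_step(params, W):
    """One gossip step: node i takes the W-weighted average of all node values."""
    n = len(params)
    return [sum(W[i][j] * params[j] for j in range(n)) for i in range(n)]


def multi_gossip(params, W, steps):
    """Apply the gossip step `steps` times in a row."""
    for _ in range(steps):
        params = gossip_step(params, W)
    return params


def consensus_error(params):
    """Largest distance between any node's value and the network average."""
    mean = sum(params) / len(params)
    return max(abs(p - mean) for p in params)


# Assumed topology: a ring of 4 nodes, each averaging itself and its two
# neighbours with equal weight. Rows and columns of W each sum to 1.
W = [
    [1 / 3, 1 / 3, 0.0, 1 / 3],
    [1 / 3, 1 / 3, 1 / 3, 0.0],
    [0.0, 1 / 3, 1 / 3, 1 / 3],
    [1 / 3, 0.0, 1 / 3, 1 / 3],
]

# Hypothetical per-node parameters after a local update.
local_params = [1.0, 2.0, 3.0, 4.0]

for steps in (0, 1, 3, 5):
    mixed = multi_gossip(local_params, W, steps)
    print(steps, consensus_error(mixed))
```

Because `W` is doubly stochastic, the network average is preserved at every step while the disagreement between nodes shrinks, so more gossip steps bring the decentralized state closer to the centralized one at the cost of extra communication rounds.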
Jun-18-2026, 03:32:31 GMT