Minibatch vs Local SGD for Heterogeneous Distributed Learning

Neural Information Processing Systems 

Given the massive scale of many modern machine learning models and datasets, it has become important to develop better methods for distributed training.