Distributed Gradient Descent with Many Local Steps in Overparameterized Models
