On the Global Convergence of Gradient Descent for Over-parameterized Models using Optimal Transport
