On the Global Convergence of Gradient Descent for Over-parameterized Models using Optimal Transport

Lénaïc Chizat, Francis Bach

Neural Information Processing Systems 

The gradient descent on over-parameterized models studied in this paper is an idealization of the usual way to train neural networks with a large hidden layer.
