Hausdorff Dimension, Heavy Tails, and Generalization in Neural Networks

Neural Information Processing Systems 

In contrast to the convex optimization setting, where the behavior of SGD is fairly well understood (see e.g.
