Feature-Learning Networks Are Consistent Across Widths At Realistic Scales

Neural Information Processing Systems 

Early in training, wide neural networks trained on online data have not only identical loss curves but also agree in their point-wise test predictions throughout training.

Similar Docs  Excel Report  more

TitleSimilaritySource
None found