Algorithmic Regularization in Learning Deep Homogeneous Models: Layers are Automatically Balanced
Simon S. Du, Wei Hu, Jason D. Lee
Neural Information Processing Systems
We study the implicit regularization imposed by gradient descent when learning multi-layer homogeneous functions, including feed-forward fully connected and convolutional deep neural networks with linear, ReLU, or Leaky ReLU activations. We rigorously prove that gradient flow (i.e.