Reviews: Implicit Bias of Gradient Descent on Linear Convolutional Networks
Neural Information Processing Systems
The paper formalizes the implicit bias of gradient descent on fully connected and convolutional linear networks trained with an exponential loss. Building on the recent work of Soudry et al., which considered a one-layer network with no activation, the paper generalizes the analysis to deeper networks (still with no activations) under the exponential loss. The two main networks considered by the authors and the corresponding results are as follows. Linear Fully Connected Networks - In this setting the authors show that, in the limit, gradient descent converges in direction to the max-margin predictor. This behaviour is the same as what was established in the earlier paper of Soudry et al. for one-layer networks.
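The fully connected result can be checked numerically on a toy problem. The following is a minimal sketch, not the authors' code: the two-layer architecture, the hidden width, the step size, the iteration count, and the five-point dataset are all assumptions chosen for illustration. The data is built so that the max-margin direction through the origin is known in closed form: the constraints beta_1 + beta_2 >= 1 and beta_1 - beta_2 >= 1 force beta_1 >= 1, so the minimum-norm solution is (1, 0), and the fifth point is not a support vector.

```python
import numpy as np

def train_two_layer_linear(X, y, hidden=4, lr=0.01, steps=20000, seed=0):
    """Gradient descent on sum_i exp(-y_i * w2^T W1 x_i).

    Returns the effective linear predictor beta = W1^T w2.
    All hyperparameters are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    W1 = 0.1 * rng.standard_normal((hidden, d))  # first (hidden) layer
    w2 = 0.1 * rng.standard_normal(hidden)       # output layer
    for _ in range(steps):
        beta = W1.T @ w2                          # end-to-end linear map
        margins = y * (X @ beta)
        # Gradient of the exponential loss with respect to beta.
        g = -(X * (y * np.exp(-margins))[:, None]).sum(axis=0)
        # Chain rule through beta = W1^T w2.
        grad_W1 = np.outer(w2, g)
        grad_w2 = W1 @ g
        W1 -= lr * grad_W1
        w2 -= lr * grad_w2
    return W1.T @ w2

# Toy linearly separable data (hypothetical, not from the paper).
X = np.array([[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0], [3.0, 2.0]])
y = np.array([1.0, 1.0, -1.0, -1.0, 1.0])

beta = train_two_layer_linear(X, y)
max_margin_dir = np.array([1.0, 0.0])  # closed-form max-margin direction for this data
cosine = float(beta @ max_margin_dir / np.linalg.norm(beta))
print("normalized predictor:", beta / np.linalg.norm(beta))
print("cosine with max-margin direction:", cosine)
```

The norm of beta keeps growing as the loss is driven toward zero, so only its direction is meaningful; the cosine with (1, 0) should approach 1. The symmetric support vectors make the alignment quick here, which should not be read as a statement about convergence speed on general data.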
Oct-7-2024, 05:01:46 GMT