Reviews: Implicit Bias of Gradient Descent on Linear Convolutional Networks

Oct-7-2024, 05:01:46 GMT–Neural Information Processing Systems

The paper formalizes the implicit bias of gradient descent on linear fully connected and linear convolutional networks trained with the exponential loss. Building on the recent work of Soudry et al., which considered a one-layer network with no activation, the paper generalizes the analysis to deeper networks, still with no activations and still with the exponential loss. The two main networks considered by the authors and the corresponding results are as follows. Linear Fully Connected Networks - In this setting the authors show that, in the limit, gradient descent converges to a predictor whose direction is that of the max-margin predictor. This is the same behaviour that the earlier paper of Soudry et al. established for one-layer networks.
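To make the fully connected claim concrete, here is a minimal numerical sketch, not taken from the paper. It assumes a two-layer linear network with effective predictor `beta = W1.T @ w2`, plain fixed-step gradient descent, and a hypothetical three-point separable dataset chosen so that the max-margin direction can be worked out by hand. The function name, hidden width, step size, and step count are all illustrative choices.

```python
import numpy as np

def train_two_layer_linear(X, y, hidden=4, lr=0.01, steps=20_000, seed=0):
    """Gradient descent on the exponential loss sum_i exp(-y_i * x_i . beta),
    where beta = W1.T @ w2 is the predictor of a two-layer linear network."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    W1 = 0.1 * rng.standard_normal((hidden, d))  # small random initialization
    w2 = 0.1 * rng.standard_normal(hidden)
    Z = y[:, None] * X                           # rows are y_i * x_i
    for _ in range(steps):
        beta = W1.T @ w2
        g = -np.exp(-Z @ beta) @ Z               # gradient of the loss w.r.t. beta
        grad_W1 = np.outer(w2, g)                # chain rule through beta = W1.T @ w2
        grad_w2 = W1 @ g
        W1 -= lr * grad_W1
        w2 -= lr * grad_w2
    return W1.T @ w2

# Hypothetical toy data (an assumption of this sketch). With z_i = y_i * x_i:
# z_1 = (1, 2), z_2 = (2, -1), z_3 = (3, 3).
X = np.array([[1.0, 2.0], [-2.0, 1.0], [3.0, 3.0]])
y = np.array([1.0, -1.0, 1.0])

# Hand-derived hard-margin solution without bias: w = (3/5, 1/5) gives
# z_1 . w = z_2 . w = 1 and z_3 . w = 12/5 > 1, so z_1 and z_2 are the
# support vectors and the max-margin direction is (3, 1) / sqrt(10).
max_margin_dir = np.array([3.0, 1.0]) / np.sqrt(10.0)

beta = train_two_layer_linear(X, y)
cosine = float(beta @ max_margin_dir / np.linalg.norm(beta))
margins = (y[:, None] * X) @ beta
print("learned direction:", beta / np.linalg.norm(beta))
print("cosine with max-margin direction:", cosine)
```

Only the direction of `beta` is compared, since its norm keeps growing as training continues. The agreement improves with more steps, but in general the convergence in direction is slow, so a finite run gives an approximation rather than the limit itself.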

convolutional network, gradient descent, implicit bias

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Statistical Learning > Gradient Descent (0.88)
  - Neural Networks (0.59)