(S)GD over Diagonal Linear Networks: Implicit Bias, Large Stepsizes and Edge of Stability

Neural Information Processing Systems

Most theoretical work on implicit regularisation has so far focused on continuous-time approximations of (S)GD, in which the impact of crucial hyperparameters such as the stepsize and the minibatch size is ignored. One common simplification is to analyse gradient flow, which is the continuous-time limit of GD and minibatch SGD as the stepsize becomes infinitesimal. By definition, this analysis does not capture the effect of stepsize or stochasticity.





204da255aea2cd4a75ace6018fad6b4d-Paper.pdf

Neural Information Processing Systems

In this paper, we consider various tree constructions and examine how the choice of parameters affects the generalization error of the resulting random forests as the sample size goes to infinity.



Towards Practical Few-Shot Query Sets: Transductive Minimum Description Length Inference

Neural Information Processing Systems

In particular, for each task at testing time, the classes effectively present in the unlabeled query set are known a priori, and correspond exactly to the set of classes represented in the labeled support set.