Goto

Collaborating Authors

 manifold







The Crucial Role of Normalization in Sharpness-Aware Minimization Yan Dai

Neural Information Processing Systems

Sharpness-A ware Minimization (SAM) is a recently proposed gradient-based optimizer (Foret et al., ICLR 2021) that greatly improves the prediction performance of deep neural networks.





Robust low-rank training via approximate orthonormal constraints

Neural Information Processing Systems

By modeling robustness in terms of the condition number of the neural network, we argue that this loss of robustness is due to the exploding singular values of the low-rank weight matrices.