Normalization Layers Are All That Sharpness-Aware Minimization Needs

Maximilian Müller, University of Tübingen and Tübingen AI Center

Neural Information Processing Systems 

In this work, we show that perturbing only the affine normalization parameters (typically comprising 0.1% of the total parameters) in the adversarial step of sharpness-aware minimization (SAM) can outperform perturbing all of the parameters.
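The idea in the abstract can be illustrated with a small pure-Python sketch. It is not the authors' implementation: the model is a hypothetical one-feature toy, the gradient is a finite-difference stand-in for backpropagation, and the names `sam_on_step`, `toy_loss`, `norm_names`, `rho`, and `lr` are assumptions made for illustration. The sketch also assumes that the perturbation is scaled by the gradient norm restricted to the normalization parameters, which the abstract does not specify.

```python
import math


def numerical_grad(loss_fn, params, h=1e-6):
    """Central-difference gradient for a dict of scalar parameters."""
    grads = {}
    for name in params:
        shifted = dict(params)
        shifted[name] = params[name] + h
        up = loss_fn(shifted)
        shifted[name] = params[name] - h
        down = loss_fn(shifted)
        grads[name] = (up - down) / (2 * h)
    return grads


def sam_on_step(loss_fn, params, norm_names, lr=0.01, rho=0.05):
    """One SAM-style step that perturbs only the parameters in norm_names.

    Returns the updated parameters and the perturbed point used for the
    second gradient evaluation.
    """
    grads = numerical_grad(loss_fn, params)
    # Assumption: normalize by the gradient norm of the perturbed subset only.
    norm = math.sqrt(sum(grads[n] ** 2 for n in norm_names))
    perturbed = dict(params)
    if norm > 0:
        for n in norm_names:
            # Adversarial (ascent) step on the affine normalization parameters.
            perturbed[n] = params[n] + rho * grads[n] / norm
    # The gradient at the perturbed point updates ALL parameters.
    sam_grads = numerical_grad(loss_fn, perturbed)
    updated = {n: params[n] - lr * sam_grads[n] for n in params}
    return updated, perturbed


def toy_loss(p):
    """Mean squared error of gamma * (weight * x) + beta on three points."""
    xs, ys = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]
    preds = [p["gamma"] * (p["weight"] * x) + p["beta"] for x in xs]
    return sum((a - b) ** 2 for a, b in zip(preds, ys)) / len(xs)


params = {"weight": 0.5, "gamma": 1.0, "beta": 0.0}
new_params, perturbed = sam_on_step(toy_loss, params, ["gamma", "beta"])
```

Here `gamma` and `beta` play the role of the affine scale and shift of a normalization layer, and `weight` stands in for every other parameter: it is left untouched in the adversarial step but still receives the gradient computed at the perturbed point.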
