Effective Sharpness Aware Minimization Requires Layerwise Perturbation Scaling
–Neural Information Processing Systems
Our findings reveal that, in wide neural networks, the dynamics of standard SAM effectively reduce to applying SAM solely in the last layer, even with optimal hyperparameters.
Oct-11-2025, 00:18:59 GMT
- Country:
  - North America > United States (0.13)
  - Europe
    - United Kingdom > England
      - Oxfordshire > Oxford (0.04)
    - Germany > Baden-Württemberg
      - Tübingen Region > Tübingen (0.14)
  - Asia > Middle East
    - Jordan (0.04)
- Genre:
  - Research Report
    - New Finding (1.00)
    - Experimental Study (1.00)
- Technology: