Effective Sharpness Aware Minimization Requires Layerwise Perturbation Scaling
–Neural Information Processing Systems
Our findings reveal that the dynamics of standard SAM effectively reduce to applying SAM solely in the last layer in wide neural networks, even with optimal hyperparameters.
Oct-11-2025, 00:18:59 GMT
- Country:
  - Asia > Middle East
    - Jordan (0.04)
  - Europe
    - Germany > Baden-Württemberg
      - Tübingen Region > Tübingen (0.14)
    - United Kingdom > England
      - Oxfordshire > Oxford (0.04)
  - North America > United States (0.13)
- Genre:
  - Research Report
    - Experimental Study (1.00)
    - New Finding (1.00)
- Technology: