AITopics | sharpness minimization algorithm

0354767c6386386be17cabe4fc59711b-Paper-Conference.pdf

Neural Information Processing SystemsFeb-7-2026, 07:08:15 GMT

arxiv preprint arxiv, generalization, sharpness, (12 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)

Add feedback

Sharpness Minimization Algorithms Do Not Only Minimize Sharpness To Achieve Better Generalization

Neural Information Processing SystemsDec-23-2025, 17:36:24 GMT

Despite extensive studies, the underlying reason as to why overparameterizedneural networks can generalize remains elusive. Existing theory shows that common stochastic optimizers prefer flatter minimizers of the training loss, and thusa natural potential explanation is that flatness implies generalization. This workcritically examines this explanation. Through theoretical and empirical investigation, we identify the following three scenarios for two-layer ReLU networks: (1)flatness provably implies generalization; (2) there exist non-generalizing flattestmodels and sharpness minimization algorithms fail to generalize poorly, and (3)perhaps most strikingly, there exist non-generalizing flattest models, but sharpnessminimization algorithms still generalize. Our results suggest that the relationshipbetween sharpness and generalization subtly depends on the data distributionsand the model architectures and sharpness minimization algorithms do not onlyminimize sharpness to achieve better generalization.

generalization, only minimize sharpness, sharpness minimization algorithm, (5 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.61)

Technology: Information Technology > Artificial Intelligence (0.59)

Add feedback

Sharpness Minimization Algorithms Do Not Only Minimize Sharpness To Achieve Better Generalization

Neural Information Processing SystemsOct-9-2024, 08:37:44 GMT

Despite extensive studies, the underlying reason as to why overparameterizedneural networks can generalize remains elusive. Existing theory shows that common stochastic optimizers prefer flatter minimizers of the training loss, and thusa natural potential explanation is that flatness implies generalization. This workcritically examines this explanation. Through theoretical and empirical investigation, we identify the following three scenarios for two-layer ReLU networks: (1)flatness provably implies generalization; (2) there exist non-generalizing flattestmodels and sharpness minimization algorithms fail to generalize poorly, and (3)perhaps most strikingly, there exist non-generalizing flattest models, but sharpnessminimization algorithms still generalize. Our results suggest that the relationshipbetween sharpness and generalization subtly depends on the data distributionsand the model architectures and sharpness minimization algorithms do not onlyminimize sharpness to achieve better generalization.

generalization, only minimize sharpness, sharpness minimization algorithm, (2 more...)

Neural Information Processing Systems

Country: North America > United States (0.30)

Genre: Research Report > New Finding (0.66)

Technology: Information Technology > Artificial Intelligence (0.66)

Add feedback

Filters

Collaborating Authors

sharpness minimization algorithm

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

0354767c6386386be17cabe4fc59711b-Paper-Conference.pdf

Sharpness Minimization Algorithms Do Not Only Minimize Sharpness To Achieve Better Generalization

Sharpness Minimization Algorithms Do Not Only Minimize Sharpness To Achieve Better Generalization