AITopics | flatness measure

flatness measure

Relative Flatness and Generalization

Neural Information Processing Systems | Dec-24-2025, 13:28:35 GMT

Flatness of the loss curve is conjectured to be connected to the generalization ability of machine learning models, in particular neural networks. While it has been empirically observed that flatness measures consistently correlate strongly with generalization, it is still an open theoretical problem why and under which circumstances flatness is connected to generalization, in particular in light of reparameterizations that change certain flatness measures but leave generalization unchanged. We investigate the connection between flatness and generalization by relating it to the interpolation from representative data, deriving notions of representativeness, and feature robustness. The notions allow us to rigorously connect flatness and generalization and to identify conditions under which the connection holds. Moreover, they give rise to a novel, but natural relative flatness measure that correlates strongly with generalization, simplifies to ridge regression for ordinary least squares, and solves the reparameterization issue.
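The reparameterization issue mentioned in this abstract can be illustrated on a toy problem. The sketch below is not the paper's measure: it assumes a hypothetical two-parameter model f(x) = a * b * x with mean squared error, and compares the plain Hessian trace with a norm-weighted curvature term for the first parameter (a² times the second derivative with respect to a), chosen here only because it is in the spirit of a relative measure. The function names and the data are illustrative assumptions.

```python
# Toy model f(x) = a * b * x trained with mean squared error.
# Rescaling (a, b) -> (alpha * a, b / alpha) leaves the function unchanged.

def mse(a, b, xs, ys):
    """Mean squared error of the toy model on the data."""
    return sum((a * b * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def hessian_trace(a, b, xs):
    """Trace of the loss Hessian: d2L/da2 + d2L/db2 (closed form for this model)."""
    mean_x2 = sum(x * x for x in xs) / len(xs)
    return 2.0 * mean_x2 * (b * b + a * a)

def norm_weighted_trace(a, b, xs):
    """Assumed illustration: a^2 * d2L/da2, curvature of one 'layer' scaled by its squared norm."""
    mean_x2 = sum(x * x for x in xs) / len(xs)
    return (a * a) * (2.0 * b * b * mean_x2)

def rescale(a, b, alpha):
    """Reparameterization that keeps the product a * b fixed."""
    return alpha * a, b / alpha

xs = [0.5, 1.0, 1.5, 2.0]
ys = [1.0, 2.0, 3.0, 4.0]
a, b = 1.0, 2.0
a2, b2 = rescale(a, b, 4.0)

print("loss:", mse(a, b, xs, ys), mse(a2, b2, xs, ys))
print("Hessian trace:", hessian_trace(a, b, xs), hessian_trace(a2, b2, xs))
print("norm-weighted:", norm_weighted_trace(a, b, xs), norm_weighted_trace(a2, b2, xs))
```

In this toy case the loss and the norm-weighted term are identical before and after rescaling, while the plain Hessian trace changes, which is the kind of behaviour the abstract describes for reparameterization-sensitive flatness measures.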

electronic proceedings, name change, relative flatness and generalization, (2 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Relative Flatness and Generalization

Neural Information Processing Systems | Nov-15-2025, 05:52:23 GMT

Flatness of the loss curve is conjectured to be connected to the generalization ability of machine learning models, in particular neural networks.

feature robustness, flatness, generalization, (16 more...)

Neural Information Processing Systems

Country:

North America > Canada > Ontario > Toronto (0.14)
Oceania > Australia (0.04)
Europe > Sweden (0.04)
(3 more...)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Relative Flatness and Generalization

Neural Information Processing Systems | Nov-15-2025, 05:52:19 GMT

Flatness of the loss curve is conjectured to be connected to the generalization ability of machine learning models, in particular neural networks.

feature robustness, flatness, generalization, (14 more...)

Neural Information Processing Systems

Country:

North America > Canada > Ontario > Toronto (0.14)
Oceania > Australia (0.04)
Europe > Sweden (0.04)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Relative Flatness and Generalization

Neural Information Processing Systems | Aug-16-2025, 06:55:53 GMT

Flatness of the loss curve is conjectured to be connected to the generalization ability of machine learning models, in particular neural networks.

artificial intelligence, flatness, machine learning, (19 more...)

Neural Information Processing Systems

Country:

North America > Canada > Ontario > Toronto (0.14)
Oceania > Australia (0.04)
Europe > Sweden (0.04)
(3 more...)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Relative Flatness and Generalization

Neural Information Processing Systems | Aug-16-2025, 06:55:49 GMT

Flatness of the loss curve is conjectured to be connected to the generalization ability of machine learning models, in particular neural networks.

artificial intelligence, generalization, machine learning, (17 more...)

Neural Information Processing Systems

Country:

North America > Canada > Ontario > Toronto (0.14)
Oceania > Australia (0.04)
Europe > Sweden (0.04)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Relative Flatness and Generalization

Neural Information Processing Systems | Jan-17-2025, 23:01:11 GMT

flatness measure, relative flatness and generalization

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.30)

FAM: Relative Flatness Aware Minimization

Adilova, Linara, Abourayya, Amr, Li, Jianning, Dada, Amin, Petzka, Henning, Egger, Jan, Kleesiek, Jens, Kamp, Michael

arXiv.org Artificial Intelligence | Jul-5-2023

Flatness of the loss curve around a model at hand has been shown to empirically correlate with its generalization ability. Optimizing for flatness was proposed as early as 1994 by Hochreiter and Schmidhuber, and was followed by more recent successful sharpness-aware optimization techniques. Their widespread adoption in practice, though, is dubious because of the lack of a theoretically grounded connection between flatness and generalization, in particular in light of the reparameterization curse: certain reparameterizations of a neural network change most flatness measures but do not change generalization. Recent theoretical work suggests that a particular relative flatness measure can be connected to generalization and solves the reparameterization curse. In this paper, we derive a regularizer based on this relative flatness that is easy to compute, fast, efficient, and works with arbitrary loss functions. It requires computing the Hessian of only a single layer of the network, which makes it applicable to large neural networks and avoids an expensive mapping of the loss surface in the vicinity of the model. In an extensive empirical evaluation we show that this relative flatness aware minimization (FAM) improves generalization in a multitude of applications and models, both in finetuning and standard training. We make the code available on GitHub.
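The idea of penalizing the curvature of a single layer can be sketched as follows. This is a simplified illustration, not the FAM regularizer itself, whose exact form is not given in the abstract: it assumes the penalty is the squared norm of one layer's flattened weights times the trace of the loss Hessian with respect to those weights, and it estimates the trace with central finite differences instead of automatic differentiation. `hessian_trace_fd`, `regularized_loss`, `toy_layer_loss`, and the weight `lam` are hypothetical names introduced for this example.

```python
def hessian_trace_fd(loss, w, h=1e-3):
    """Estimate the Hessian trace of `loss` at `w` with central second differences."""
    base = loss(w)
    total = 0.0
    for i in range(len(w)):
        up = list(w)
        up[i] += h
        down = list(w)
        down[i] -= h
        # Second derivative along coordinate i.
        total += (loss(up) - 2.0 * base + loss(down)) / (h * h)
    return total

def regularized_loss(loss, w, lam=0.1, h=1e-3):
    """Loss plus an assumed penalty: lam * ||w||^2 * Tr(H), for one layer's weights only."""
    sq_norm = sum(v * v for v in w)
    return loss(w) + lam * sq_norm * hessian_trace_fd(loss, w, h)

def toy_layer_loss(w):
    """Hypothetical loss as a function of one layer's two flattened weights."""
    return 3.0 * w[0] ** 2 + 0.5 * w[1] ** 2 + w[0] * w[1]

w = [1.0, -2.0]
print("trace estimate:", hessian_trace_fd(toy_layer_loss, w))
print("regularized loss:", regularized_loss(toy_layer_loss, w))
```

Restricting the Hessian to one layer keeps the cost proportional to that layer's size; in a real network the finite differences would normally be replaced by Hessian-vector products from an automatic differentiation library.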

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2307.02337

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Canada > Ontario > Toronto (0.14)
Europe > Germany (0.04)
(5 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Relating Adversarially Robust Generalization to Flat Minima

Stutz, David, Hein, Matthias, Schiele, Bernt

arXiv.org Machine Learning | Apr-9-2021

Adversarial training (AT) has become the de-facto standard to obtain models robust against adversarial examples. However, AT exhibits severe robust overfitting: cross-entropy loss on adversarial examples, so-called robust loss, decreases continuously on training examples, while eventually increasing on test examples. In practice, this leads to poor robust generalization, i.e., adversarial robustness does not generalize well to new examples. In this paper, we study the relationship between robust generalization and flatness of the robust loss landscape in weight space, i.e., whether robust loss changes significantly when perturbing weights. To this end, we propose average- and worst-case metrics to measure flatness in the robust loss landscape and show a correlation between good robust generalization and flatness. For example, throughout training, flatness reduces significantly during overfitting such that early stopping effectively finds flatter minima in the robust loss landscape. Similarly, AT variants achieving higher adversarial robustness also correspond to flatter minima. This holds for many popular choices, e.g., AT-AWP, TRADES, MART, AT with self-supervision or additional unlabeled examples, as well as simple regularization techniques, e.g., AutoAugment, weight decay or label noise. For fair comparison across these approaches, our flatness measures are specifically designed to be scale-invariant and we conduct extensive experiments to validate our findings.
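A rough sketch of what average- and worst-case flatness under weight perturbations can look like is given below. It is an assumption-laden illustration, not the metrics defined in the paper: perturbations are drawn uniformly in direction with a norm fixed relative to the weight norm (`xi * ||w||`), the average case is the mean loss increase, and the worst case is approximated by the largest increase among the random samples, which only lower-bounds a true worst case. All names and the toy loss are hypothetical.

```python
import math
import random

def relative_perturbation(w, xi, rng):
    """Random direction scaled so that ||delta|| = xi * ||w|| (relative to the weight norm)."""
    direction = [rng.gauss(0.0, 1.0) for _ in w]
    d_norm = math.sqrt(sum(d * d for d in direction)) or 1.0  # guard against a zero vector
    w_norm = math.sqrt(sum(v * v for v in w))
    return [xi * w_norm * d / d_norm for d in direction]

def flatness_estimates(loss, w, xi=0.1, samples=200, seed=0):
    """Return (average, worst) loss increase over random relative weight perturbations."""
    rng = random.Random(seed)
    base = loss(w)
    increases = []
    for _ in range(samples):
        delta = relative_perturbation(w, xi, rng)
        perturbed = [v + d for v, d in zip(w, delta)]
        increases.append(loss(perturbed) - base)
    return sum(increases) / len(increases), max(increases)

W_STAR = [1.0, 2.0]

def toy_loss(w):
    """Hypothetical loss with its minimum at W_STAR."""
    return sum((v - s) ** 2 for v, s in zip(w, W_STAR))

avg, worst = flatness_estimates(toy_loss, W_STAR)
print("average-case:", avg, "worst-case (sampled):", worst)
```

Scaling the perturbation by the weight norm is one simple way to make such a measure insensitive to a global rescaling of the weights; the abstract states that the paper's measures are designed to be scale-invariant but does not say how.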

deep learning, neural network, rloss, (20 more...)

arXiv.org Machine Learning

2104.04448

Country: Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)

Genre: Research Report > New Finding (0.66)

Industry:

Materials > Chemicals > Industrial Gases > Liquified Gas (0.67)
Materials > Chemicals > Commodity Chemicals > Petrochemicals > LNG (0.67)
Energy > Oil & Gas > Midstream (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

What can flatness teach us: understanding generalisation in Deep Neural Networks

#artificialintelligence | Mar-29-2021, 20:05:11 GMT

This is the third post in a series summarising work that seeks to provide a theory of generalisation in Deep Neural Networks (DNNs). Briefly, the first post summarises evidence that DNNs trained with stochastic optimisers (like SGD) find functions with probability proportional to their volume in parameter-space, and the second post argues that these high-volume functions are 'simple', thus explaining why DNNs generalise. In the following, we summarise results in [1] which explain why the 'flatness of the loss landscape' has been shown to correlate with generalisation -- a well-known result (see e.g. They provide substantial empirical evidence that this correlation is actually a combination of (1) a weak correlation between the local flatness and the volume of the surrounding function, and (2) a strong correlation between volume and generalisation. This combination produces a weak correlation between 'flatness' and generalisation.

correlation, deep neural network, generalisation, (15 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.73)

Why Flatness Correlates With Generalization For Deep Neural Networks

Zhang, Shuofeng, Reid, Isaac, Pérez, Guillermo Valle, Louis, Ard

arXiv.org Machine Learning | Mar-10-2021

The intuition that local flatness of the loss landscape is correlated with better generalization for deep neural networks (DNNs) has been explored for decades, spawning many different local flatness measures. Here we argue that these measures correlate with generalization because they are local approximations to a global property, the volume of the set of parameters mapping to a specific function. This global volume is equivalent to the Bayesian prior upon initialization. For functions that give zero error on a test set, it is directly proportional to the Bayesian posterior, making volume a more robust and theoretically better grounded predictor of generalization than flatness. Whilst flatness measures fail under parameter re-scaling, volume remains invariant and therefore continues to correlate well with generalization. Moreover, some variants of SGD can break the flatness-generalization correlation, while the volume-generalization correlation remains intact.

correlation, generalization, sharpness, (13 more...)

arXiv.org Machine Learning

2103.06219

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)
