

Train longer, generalize better: closing the generalization gap in large batch training of neural networks

Neural Information Processing Systems

Background: Deep learning models are typically trained using stochastic gradient descent or one of its variants. These methods update the weights using their gradient, estimated from a small fraction of the training data. It has been observed that when using large batch sizes there is a persistent degradation in generalization performance, known as the "generalization gap" phenomenon. Identifying the origin of this gap and closing it has remained an open problem. Contributions: We examine the initial high-learning-rate training phase.
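As a rough illustration of the setup described above, the sketch below runs plain minibatch SGD on a toy least-squares problem; the data, model, learning rate, and batch_size are hypothetical and serve only to show how the gradient is estimated from a small fraction of the training data at each step.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dataset: 1,000 examples, 20 features, linear targets with noise.
X = rng.normal(size=(1000, 20))
w_true = rng.normal(size=20)
y = X @ w_true + 0.1 * rng.normal(size=1000)

w = np.zeros(20)      # model weights
lr = 0.1              # learning rate
batch_size = 32       # the generalization gap is reported when this is made large

for step in range(500):
    # Estimate the gradient from a random mini-batch of the training data.
    idx = rng.choice(len(X), size=batch_size, replace=False)
    Xb, yb = X[idx], y[idx]
    grad = (2.0 / batch_size) * Xb.T @ (Xb @ w - yb)  # gradient of mean squared error
    w -= lr * grad                                    # SGD weight update

print("training MSE:", np.mean((X @ w - y) ** 2))
```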


Appendix A Proof of Theorem 2.1

Neural Information Processing Systems

We have the following lemma. Using the notation of Lemma A.1, we have ... The third inequality uses the Lipschitz assumption on the loss function.

Figure 10 supplements 'Relation to disagreement' at the end of Section 2. It shows an example where the behavior of inconsistency differs from that of disagreement. All experiments were run on GPUs (A100 or older). The goal of the experiments reported in Section 3.1 was to find whether/how the predictiveness of ... The arrows indicate the direction of training becoming longer.






On the Limitations of Fractal Dimension as a Measure of Generalization

Charlie B. Tan (University of Oxford), Inés García-Redondo (Imperial College London), Qiquan Wang

Neural Information Processing Systems

Bounding and predicting the generalization gap of overparameterized neural networks remains a central open problem in theoretical machine learning. There is a recent and growing body of literature that proposes the framework of fractals to model optimization trajectories of neural networks, motivating generalization bounds and measures based on the fractal dimension of the trajectory. Notably, the persistent homology dimension has been proposed to correlate with the generalization gap.
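As a loose illustration of assigning a fractal dimension to an optimization trajectory, the sketch below estimates a box-counting dimension for a recorded sequence of iterates. This is not the persistent homology dimension discussed above; the trajectory, the grid scales, and the helper name box_counting_dimension are hypothetical choices for the example.

```python
import numpy as np

def box_counting_dimension(points, scales):
    """Estimate the box-counting dimension of a point cloud (e.g. optimizer iterates).

    points : (T, d) array of trajectory points.
    scales : iterable of box side lengths to test.
    Returns the slope of log N(eps) versus log(1/eps).
    """
    points = np.asarray(points)
    # Normalize the trajectory into the unit cube so the scales are comparable.
    span = points.max(axis=0) - points.min(axis=0) + 1e-12
    unit = (points - points.min(axis=0)) / span

    log_inv_eps, log_counts = [], []
    for eps in scales:
        # Assign each point to a grid cell of side eps and count distinct cells.
        cells = np.floor(unit / eps).astype(np.int64)
        n_boxes = len(np.unique(cells, axis=0))
        log_inv_eps.append(np.log(1.0 / eps))
        log_counts.append(np.log(n_boxes))
    # Dimension estimate = slope of the log-log fit.
    slope, _ = np.polyfit(log_inv_eps, log_counts, deg=1)
    return slope

# Hypothetical "trajectory": a noisy random walk in 10 dimensions.
rng = np.random.default_rng(0)
trajectory = np.cumsum(rng.normal(scale=0.01, size=(5000, 10)), axis=0)
print("box-counting dimension estimate:",
      box_counting_dimension(trajectory, scales=[0.2, 0.1, 0.05, 0.025]))
```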


NAIS-Net: Stable Deep Networks from Non-Autonomous Differential Equations

Marco Ciccone, Marco Gallieri, Jonathan Masci, Christian Osendorfer, Faustino Gomez

Neural Information Processing Systems

Each block represents a time-invariant iterative process, as the first layer in the i-th block, x_i(1), is unrolled into a pattern-dependent number, K_i, of processing stages, using weight matrices A_i and B_i. The skip connections from the input, u_i, to all layers in block i make the process non-autonomous. Blocks can be chained together (each block modeling a different latent space) by passing the final latent representation, x_i(K_i), of block i as the input to block i+1.
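A minimal sketch of this unrolling, assuming a residual-style update x_i(k+1) = x_i(k) + tanh(A_i x_i(k) + B_i u_i + b_i) and a fixed number of stages per block; the exact update rule, the stability constraints on A_i, and the criterion that makes K_i pattern-dependent are specified in the paper and are not reproduced here. The initialization of x_i(1) below is likewise a hypothetical choice.

```python
import numpy as np

def nais_block(u, A, B, b, num_stages):
    """Unroll one non-autonomous block: the input u is re-injected at every stage."""
    x = np.tanh(B @ u + b)                 # x_i(1), hypothetical initialization
    for _ in range(num_stages - 1):
        # The skip connection from u makes the otherwise time-invariant process non-autonomous.
        x = x + np.tanh(A @ x + B @ u + b)
    return x                               # x_i(K_i): final latent representation of the block

rng = np.random.default_rng(0)
dim_in, dim_latent = 8, 16
u = rng.normal(size=dim_in)

# Chain two blocks: the output of block i is the input to block i+1.
A1 = 0.1 * rng.normal(size=(dim_latent, dim_latent))
B1 = rng.normal(size=(dim_latent, dim_in))
x1 = nais_block(u, A1, B1, np.zeros(dim_latent), num_stages=5)

A2 = 0.1 * rng.normal(size=(dim_latent, dim_latent))
B2 = rng.normal(size=(dim_latent, dim_latent))
x2 = nais_block(x1, A2, B2, np.zeros(dim_latent), num_stages=5)
print("final representation shape:", x2.shape)
```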