AITopics | geometric complexity

Collaborating Authors

geometric complexity

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Why neural networks find simple solutions: the many regularizers of geometric complexity

Neural Information Processing SystemsApr-24-2026, 14:57:07 GMT

In many contexts, simpler models are preferable to more complex models and the control of this model complexity is the goal for many methods in machine learning such as regularization, hyperparameter tuning and architecture design. In deep learning, it has been difficult to understand the underlying mechanisms of complexity control, since many traditional measures are not naturally suitable for deep neural networks. Here we develop the notion of geometric complexity, which is a measure of the variability of the model function, computed using a discrete Dirichlet energy. Using a combination of theoretical arguments and empirical results, we show that many common training heuristics such as parameter norm regularization, spectral norm regularization, flatness regularization, implicit gradient regularization, noise regularization and the choice of parameter initialization all act to control geometric complexity, providing a unifying framework in which to characterize the behavior of deep learning models.

artificial intelligence, machine learning, regularization, (14 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

The Impact of Geometric Complexity on Neural Collapse in Transfer Learning

Neural Information Processing SystemsMar-21-2026, 06:13:40 GMT

Many of the recent advances in computer vision and language models can be attributed to the success of transfer learning via the pre-training of large foundation models. However, a theoretical framework which explains this empirical success is incomplete and remains an active area of research. Flatness of the loss surface and neural collapse have recently emerged as useful pre-training metrics which shed light on the implicit biases underlying pre-training. In this paper, we explore the geometric complexity of a model's learned representations as a fundamental mechanism that relates these two concepts. We show through experiments and theory that mechanisms which affect the geometric complexity of the pre-trained network also influence the neural collapse. Furthermore, we show how this effect of the geometric complexity generalizes to the neural collapse of new classes as well, thus encouraging better performance on downstream tasks, particularly in the few-shot setting.

artificial intelligence, machine learning, proceedings, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

7b24015f3af598e1d9179f6e06353780-Paper-Conference.pdf

Neural Information Processing SystemsFeb-16-2026, 01:08:30 GMT

machine learning, natural language, neural collapse, (18 more...)

Neural Information Processing Systems

Country: Africa > Middle East > Tunisia > Ben Arous Governorate > Ben Arous (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry: Information Technology (0.46)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

Add feedback

14d2d59c5e2302bfa6ce6cae59156e33-Paper-Conference.pdf

Neural Information Processing SystemsFeb-8-2026, 08:04:45 GMT

international conference, neural network, relu mpnn, (12 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
(4 more...)

Genre: Research Report > Experimental Study (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Data Science > Data Mining (0.67)

Add feedback

0ff3502bb29570b219967278db150a50-Paper-Conference.pdf

Neural Information Processing SystemsFeb-7-2026, 12:05:22 GMT

complexity, neural network, regularization, (13 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > Canada > Ontario > Toronto (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.70)

Add feedback

Penalizing Localized Dirichlet Energies in Low Rank Tensor Products

Karakasis, Paris A., Sidiropoulos, Nicholas D.

arXiv.org Machine LearningJan-21-2026

We study low-rank tensor-product B-spline (TPBS) models for regression tasks and investigate Dirichlet energy as a measure of smoothness. We show that TPBS models admit a closed-form expression for the Dirichlet energy, and reveal scenarios where perfect interpolation is possible with exponentially small Dirichlet energy. This renders global Dirichlet energy-based regularization ineffective. To address this limitation, we propose a novel regularization strategy based on local Dirichlet energies defined on small hypercubes centered at the training points. Leveraging pretrained TPBS models, we also introduce two estimators for inference from incomplete samples. Comparative experiments with neural networks demonstrate that TPBS models outperform neural networks in the overfitting regime for most datasets, and maintain competitive performance otherwise. Overall, TPBS models exhibit greater robustness to overfitting and consistently benefit from regularization, while neural networks are more sensitive to overfitting and less effective in leveraging regularization.

artificial intelligence, machine learning, neural network, (14 more...)

arXiv.org Machine Learning

2601.14173

Country: North America > United States (0.46)

Genre: Research Report (0.64)

Industry: Health & Medicine (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Why neural networks find simple solutions: The many regularizers of geometric complexity

Neural Information Processing SystemsDec-23-2025, 18:51:15 GMT

geometric complexity, neural network find simple solution, regularization, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

7b24015f3af598e1d9179f6e06353780-Paper-Conference.pdf

Neural Information Processing SystemsOct-10-2025, 06:55:42 GMT

geometric complexity, neural collapse, regularization, (16 more...)

Neural Information Processing Systems

Country: Africa > Middle East > Tunisia > Ben Arous Governorate > Ben Arous (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry: Information Technology (0.46)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

Add feedback

14d2d59c5e2302bfa6ce6cae59156e33-Paper-Conference.pdf

Neural Information Processing SystemsOct-9-2025, 19:09:40 GMT

Unsurprisingly, GNNs as MPNNs have been analyzed from multiple theoretical perspectives.

international conference, neural network, relu mpnn, (12 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
(4 more...)

Genre: Research Report > Experimental Study (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Data Science > Data Mining (0.67)

Add feedback

On residual network depth

Dherin, Benoit, Munn, Michael

arXiv.org Machine LearningOct-7-2025

Deep residual architectures, such as ResNet and the Transformer, have enabled models of unprecedented depth, yet a formal understanding of why depth is so effective remains an open question. A popular intuition, following Veit et al. (2016), is that these residual networks behave like ensembles of many shallower models. Our key finding is an explicit analytical formula that verifies this ensemble perspective, proving that increasing network depth is mathematically equivalent to expanding the size of this implicit ensemble. Furthermore, our expansion reveals a hierarchical ensemble structure in which the combinatorial growth of computation paths leads to an explosion in the output signal, explaining the historical necessity of normalization layers in training deep models. This insight offers a first principles explanation for the historical dependence on normalization layers and sheds new light on a family of successful normalization-free techniques like SkipInit and Fixup. However, while these previous approaches infer scaling factors through optimizer analysis or a heuristic analogy to Batch Normalization, our work offers the first explanation derived directly from the network's inherent functional structure. Specifically, our Residual Expansion Theorem reveals that scaling each residual module provides a principled solution to taming the combinatorial explosion inherent to these architectures. We further show that this scaling acts as a capacity controls that also implicitly regularizes the model's complexity.

complexity, ensemble, residual network, (14 more...)

arXiv.org Machine Learning

2510.0347

Country:

North America > Canada > Ontario > Toronto (0.14)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)

Add feedback