AITopics | universal approximation property

bd4a6d0563e0604510989eb8f9ff71f5-Supplemental.pdf

Neural Information Processing Systems | Feb-10-2026, 22:57:53 GMT

ablation study, architecture, main result, (15 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.73)

Broad stochastic configuration residual learning system for norm-convergent universal approximation

Su, Han, Li, Zhongyan, Liu, Wanquan

arXiv.org Artificial Intelligence | Nov-21-2025

Universal approximation serves as the foundation of neural network learning algorithms. However, some networks establish their universal approximation property by demonstrating that the iterative errors converge in probability measure rather than in the more rigorous sense of norm convergence, which makes the universal approximation property of randomized learning networks highly sensitive to random parameter selection. The broad residual learning system (BRLS), as a member of the randomized learning models, also encounters this issue. We theoretically demonstrate the limitation of its universal approximation property: the iterative errors do not satisfy norm convergence if the selection of random parameters is inappropriate and the convergence rate meets certain conditions. To address this issue, we propose the broad stochastic configuration residual learning system (BSCRLS) algorithm, which features a novel supervisory mechanism that adaptively constrains the range settings of random parameters on the basis of the BRLS framework. Furthermore, we prove the universal approximation theorem of BSCRLS based on the more stringent norm convergence. Three versions of incremental BSCRLS algorithms are presented to satisfy the application requirements of various network updates. Solar panel dust detection experiments are performed on a publicly available dataset and compared with 13 deep and broad learning algorithms. Experimental results reveal the effectiveness and superiority of the BSCRLS algorithms.

artificial intelligence, machine learning, universal approximation property, (14 more...)

arXiv.org Artificial Intelligence

2511.1655

Country:

Asia > China (0.29)
North America > United States (0.28)

Genre: Research Report (1.00)

Industry: Energy > Renewable > Solar (0.89)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Approximation theory for 1-Lipschitz ResNets

Murari, Davide, Furuya, Takashi, Schönlieb, Carola-Bibiane

arXiv.org Artificial Intelligence | Oct-14-2025

1-Lipschitz neural networks are fundamental for generative modelling, inverse problems, and robust classifiers. In this paper, we focus on 1-Lipschitz residual networks (ResNets) based on explicit Euler steps of negative gradient flows and study their approximation capabilities. Leveraging the Restricted Stone-Weierstrass Theorem, we first show that these 1-Lipschitz ResNets are dense in the set of scalar 1-Lipschitz functions on any compact domain when width and depth are allowed to grow. We also show that these networks can exactly represent scalar piecewise affine 1-Lipschitz functions. We then prove a stronger statement: by inserting norm-constrained linear maps between the residual blocks, the same density holds when the hidden width is fixed. Because every layer obeys simple norm constraints, the resulting models can be trained with off-the-shelf optimisers. This paper provides the first universal approximation guarantees for 1-Lipschitz ResNets, laying a rigorous foundation for their practical use.
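As a rough illustration of the kind of layer this abstract describes, the sketch below implements a residual block as one explicit Euler step of a negative gradient flow, x - h * W^T ReLU(W x + b). The ReLU activation, the spectral normalisation helper, the step size `h`, and all function names are assumptions made for this example and are not taken from the paper. With the spectral norm of `W` at most 1 and `h` at most 2, the Jacobian I - h * W^T D W (where D is a 0/1 diagonal matrix) has eigenvalues in [-1, 1], so each step is 1-Lipschitz in the Euclidean norm.

```python
import numpy as np

def spectral_normalize(W):
    """Rescale W so its spectral norm is at most 1 (hypothetical helper)."""
    s = np.linalg.norm(W, 2)  # largest singular value of a 2-D array
    return W / max(s, 1.0)

def euler_gradient_step(x, W, b, h=1.0):
    """One explicit Euler step of the flow x' = -W^T relu(W x + b).

    Assumes ||W||_2 <= 1 and 0 <= h <= 2, which keeps the step 1-Lipschitz.
    """
    return x - h * W.T @ np.maximum(W @ x + b, 0.0)

def resnet(x, layers, h=1.0):
    """Compose several Euler steps; a composition of 1-Lipschitz maps is 1-Lipschitz."""
    for W, b in layers:
        x = euler_gradient_step(x, W, b, h)
    return x

# Usage: build a small network and compare output distance with input distance.
rng = np.random.default_rng(0)
layers = [(spectral_normalize(rng.normal(size=(8, 4))), rng.normal(size=8))
          for _ in range(5)]
x, y = rng.normal(size=4), rng.normal(size=4)
ratio = np.linalg.norm(resnet(x, layers) - resnet(y, layers)) / np.linalg.norm(x - y)
print(f"empirical Lipschitz ratio: {ratio:.3f}")
```

The printed ratio should not exceed 1 for any pair of inputs. This sketch only illustrates the norm constraint; it says nothing about the density results stated in the abstract.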

artificial intelligence, constraint, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2505.12003

Genre: Research Report > New Finding (0.46)

Industry: Government (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Appendix A Universal Approximation Here we show in Proposition 1 that our Combiner-X achieves universal approximation property [

Neural Information Processing Systems | Aug-17-2025, 03:37:55 GMT

As discussed in section 4.4.

architecture, artificial intelligence, machine learning, (17 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.73)

The Influence of the Memory Capacity of Neural DDEs on the Universal Approximation Property

Kuehn, Christian, Kuntz, Sara-Viola

arXiv.org Artificial Intelligence | Jun-9-2025

Neural Ordinary Differential Equations (Neural ODEs), which are the continuous-time analog of Residual Neural Networks (ResNets), have gained significant attention in recent years. Similarly, Neural Delay Differential Equations (Neural DDEs) can be interpreted as an infinite depth limit of Densely Connected Residual Neural Networks (DenseResNets). In contrast to traditional ResNet architectures, DenseResNets are feed-forward networks that allow for shortcut connections across all layers. These additional connections introduce memory in the network architecture, as is typical in many modern architectures. In this work, we explore how the memory capacity in neural DDEs influences the universal approximation property. The key parameter for studying the memory capacity is the product $Kτ$ of the Lipschitz constant $K$ and the delay $τ$ of the DDE. In the case of non-augmented architectures, where the network width is not larger than the input and output dimensions, neural ODEs and classical feed-forward neural networks cannot have the universal approximation property. We show that if the memory capacity $Kτ$ is sufficiently small, the dynamics of the neural DDE can be approximated by a neural ODE. Consequently, non-augmented neural DDEs with a small memory capacity also lack the universal approximation property. In contrast, if the memory capacity $Kτ$ is sufficiently large, we can establish the universal approximation property of neural DDEs for continuous functions. If the neural DDE architecture is augmented, we can expand the parameter regions in which universal approximation is possible. Overall, our results show that the infinite-dimensional phase space of DDEs with positive delay $τ>0$ is not sufficient to guarantee a direct jump to universal approximation as the memory capacity $Kτ$ increases; universal approximation holds only beyond a certain memory threshold.

artificial intelligence, dde, machine learning, (13 more...)

arXiv.org Artificial Intelligence

2505.07244

Country: Europe (0.28)

Genre: Research Report > New Finding (0.85)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Universal approximation property of neural stochastic differential equations

Kwossek, Anna P., Prömel, David J., Teichmann, Josef

arXiv.org Machine Learning | Mar-20-2025

We identify various classes of neural networks that are able to approximate continuous functions locally uniformly subject to fixed global linear growth constraints. For such neural networks the associated neural stochastic differential equations can approximate general stochastic differential equations, both of Itô diffusion type, arbitrarily well. Moreover, quantitative error estimates are derived for stochastic differential equations with sufficiently regular coefficients.

approximation property, artificial intelligence, machine learning, (15 more...)

arXiv.org Machine Learning

2503.16696

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
North America > United States > New York (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(3 more...)

Genre: Research Report (0.40)

Technology:

Information Technology > Mathematics of Computing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Approximation properties of neural ODEs

De Marinis, Arturo, Murari, Davide, Celledoni, Elena, Guglielmi, Nicola, Owren, Brynjulf, Tudisco, Francesco

arXiv.org Artificial Intelligence | Mar-19-2025

We study the approximation properties of shallow neural networks whose activation function is defined as the flow of a neural ordinary differential equation (neural ODE) at the final time of the integration interval. We prove the universal approximation property (UAP) of such shallow neural networks in the space of continuous functions. Furthermore, we investigate the approximation properties of shallow neural networks whose parameters are required to satisfy some constraints. In particular, we constrain the Lipschitz constant of the flow of the neural ODE to increase the stability of the shallow neural network, and we restrict the norm of the weight matrices of the linear layers to one to make sure that the restricted expansivity of the flow is not compensated by the increased expansivity of the linear layers. For this setting, we prove approximation bounds that tell us the accuracy to which we can approximate a continuous function with a shallow neural network with such constraints. We prove that the UAP holds if we consider only the constraint on the Lipschitz constant of the flow or the unit norm constraint on the weight matrices of the linear layers.

artificial intelligence, machine learning, neural network, (14 more...)

arXiv.org Artificial Intelligence

2503.15696

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > Italy > Abruzzo > L'Aquila Province > L'Aquila (0.04)
Europe > Norway > Central Norway > Trøndelag > Trondheim (0.04)
Europe > United Kingdom > Scotland > City of Edinburgh > Edinburgh (0.04)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Recurrent Stochastic Configuration Networks with Hybrid Regularization for Nonlinear Dynamics Modelling

Dang, Gang, Wang, Dianhui

arXiv.org Machine Learning | Nov-25-2024

Recurrent stochastic configuration networks (RSCNs) have shown great potential in modelling nonlinear dynamic systems with uncertainties. This paper presents an RSCN with hybrid regularization to enhance both the learning capacity and generalization performance of the network. Given a set of temporal data, the well-known least absolute shrinkage and selection operator (LASSO) is employed to identify the significant order variables. Subsequently, an improved RSCN with L2 regularization is introduced to approximate the residuals between the output of the target plant and the LASSO model. The output weights are updated in real-time through a projection algorithm, facilitating a rapid response to dynamic changes within the system. A theoretical analysis of the universal approximation property is provided, contributing to the understanding of the network's effectiveness in representing various complex nonlinear functions. Experimental results from a nonlinear system identification problem and two industrial predictive tasks demonstrate that the proposed method outperforms other models across all testing datasets.

artificial intelligence, data mining, machine learning, (16 more...)

arXiv.org Machine Learning

2412.0007

Country:

Asia > China > Liaoning Province > Shenyang (0.04)
Europe > Germany > North Rhine-Westphalia > Cologne Region > Bonn (0.04)
Asia > China > Jiangsu Province > Xuzhou (0.04)

Genre: Research Report (0.64)

Industry: Energy (0.94)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.71)

Scalable Message Passing Neural Networks: No Need for Attention in Large Graph Representation Learning

Borde, Haitz Sáez de Ocáriz, Lukoianov, Artem, Kratsios, Anastasis, Bronstein, Michael, Dong, Xiaowen

arXiv.org Artificial Intelligence | Oct-29-2024

Traditionally, Graph Neural Networks (GNNs) [1] have primarily been applied to model functions over graphs with a relatively modest number of nodes. However, recently there has been a growing interest in exploring the application of GNNs to large-scale graph benchmarks, including datasets with up to a hundred million nodes [2]. This exploration could potentially lead to better models for industrial applications such as large-scale network analysis in social media, where there are typically millions of users, or in biology, where proteins and other macromolecules are composed of a large number of atoms. This presents a significant challenge in designing GNNs that are scalable while retaining their effectiveness. To this end, we take inspiration from the literature on Large Language Models (LLMs) and propose a simple modification to how GNN architectures are typically arranged. Our framework, Scalable Message Passing Neural Networks (SMPNNs), enables the construction of deep and scalable architectures that outperform the current state-of-the-art models for large graph benchmarks in transductive classification. More specifically, we find that following the typical construction of the Pre-Layer Normalization (Pre-LN) Transformer formulation [3] and replacing attention with standard message-passing convolution is enough to outperform the best Graph Transformers in the literature. Moreover, since our formulation does not necessarily require attention, our architecture scales better than Graph Transformers.
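The arrangement described in this abstract, a Pre-LN Transformer block with attention replaced by a message-passing convolution, can be sketched roughly as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the GCN-style symmetric normalisation, the ReLU feed-forward sublayer, the omission of learnable layer-norm parameters, and every function and parameter name (`smpnn_style_block`, `W_conv`, `W1`, `W2`) are hypothetical choices made for this example.

```python
import numpy as np

def layer_norm(X, eps=1e-5):
    """Normalise each node's feature vector (no learnable scale or shift)."""
    mu = X.mean(axis=-1, keepdims=True)
    var = X.var(axis=-1, keepdims=True)
    return (X - mu) / np.sqrt(var + eps)

def normalized_adjacency(A):
    """GCN-style normalisation D^{-1/2} (A + I) D^{-1/2} (assumed for this sketch)."""
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    return A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def smpnn_style_block(X, A_norm, W_conv, W1, W2):
    """Pre-LN residual block with message passing where attention would be."""
    # Sublayer 1: normalise, aggregate neighbour features, add the residual.
    X = X + A_norm @ layer_norm(X) @ W_conv
    # Sublayer 2: normalise, apply a pointwise feed-forward network, add the residual.
    H = np.maximum(layer_norm(X) @ W1, 0.0)
    return X + H @ W2

# Usage on a 3-node path graph with 4 features per node.
rng = np.random.default_rng(0)
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
X = rng.normal(size=(3, 4))
out = smpnn_style_block(X, normalized_adjacency(A),
                        0.1 * rng.normal(size=(4, 4)),
                        0.1 * rng.normal(size=(4, 8)),
                        0.1 * rng.normal(size=(8, 4)))
print(out.shape)
```

Because the aggregation is a product with a fixed normalised adjacency matrix, which is typically sparse for large graphs, it avoids the all-pairs score computation of dense attention. This is consistent with the scaling remark in the abstract, although the dense matrix used here is only suitable for toy graphs.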

artificial intelligence, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2411.00835

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Massachusetts (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
(3 more...)

Genre: Research Report (1.00)

Industry:

Government (0.49)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Personality Disorder > Narcissistic Personality Disorder (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
