sigmoidal function


Schauder Bases for $C[0, 1]$ Using ReLU, Softplus and Two Sigmoidal Functions

Ganesh, Anand, Bose, Babhrubahan, Rajagopalan, Anand

arXiv.org Artificial Intelligence

We construct four Schauder bases for the space $C[0,1]$, one using ReLU functions, another using Softplus functions, and two more using sigmoidal versions of the ReLU and Softplus functions. This establishes the existence of a basis using these functions for the first time, and improves on the universal approximation property associated with them. We also show an $O(\frac{1}{n})$ approximation bound based on our ReLU basis, and a negative result on constructing multivariate functions using finite combinations of ReLU functions.
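
The abstract does not spell out the construction, but the classical Faber–Schauder basis of C[0,1] is built from triangular "hat" functions, and each hat is a finite combination of ReLU functions. A minimal numpy sketch of that building block (the breakpoints a, b, c are illustrative, not the paper's exact basis):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def hat(x, a, b, c):
    """Triangular hat supported on [a, c] with peak 1 at b,
    written as a finite combination of three ReLU functions."""
    return (relu(x - a) / (b - a)
            - relu(x - b) * (1.0 / (b - a) + 1.0 / (c - b))
            + relu(x - c) / (c - b))

x = np.linspace(0.0, 1.0, 5)
print(hat(x, 0.0, 0.5, 1.0))  # [0, 0.5, 1, 0.5, 0] up to floating point
```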


Convergence Analysis of Max-Min Exponential Neural Network Operators in Orlicz Space

Pradhan, Satyaranjan, Soren, Madan Mohan

arXiv.org Artificial Intelligence

In this work, we propose a Max-Min approach for approximating functions using exponential neural network operators. We extend this framework to develop Max-Min Kantorovich-type exponential neural network operators and investigate their approximation properties. We study both pointwise and uniform convergence for univariate functions. To analyze the order of convergence, we use the logarithmic modulus of continuity and estimate the corresponding rate of convergence. Furthermore, we examine the convergence behavior of the Max-Min Kantorovich-type exponential neural network operators in the Orlicz space setting. We also provide graphical representations illustrating the approximation error for suitable kernels and sigmoidal activation functions.


Parallel Layer Normalization for Universal Approximation

Ni, Yunhao, Liu, Yuhe, Sun, Wenxin, Tang, Yitong, Guo, Yuxin, Feng, Peilin, Wu, Wenjun, Huang, Lei

arXiv.org Machine Learning

The universal approximation theorem (UAT) is a fundamental result for deep neural networks (DNNs), demonstrating their capacity to represent and approximate any function. Existing analyses and proofs of the UAT consider traditional networks built only from linear layers and nonlinear activation functions, omitting the normalization layers commonly employed to ease the training of modern networks. This paper studies the UAT of DNNs with normalization layers for the first time. We theoretically prove that an infinitely wide network -- composed solely of parallel layer normalization (PLN) and linear layers -- has universal approximation capacity. Additionally, we investigate the minimum number of neurons required to approximate $L$-Lipschitz continuous functions with a single-hidden-layer network. We theoretically compare the approximation capacity of PLN with that of traditional activation functions. Unlike traditional activation functions, PLN can act as both an activation function and a normalization in deep neural networks at the same time. We also find that PLN can improve performance when replacing LN in transformer architectures, which reveals the potential of PLN in neural architecture design.
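
The abstract does not define PLN in detail. A minimal numpy sketch of one plausible reading, layer normalization applied independently to parallel groups of neurons so that it can serve as the network's nonlinearity, is given below; the group size and shapes are illustrative, not the paper's exact formulation:

```python
import numpy as np

def parallel_layer_norm(h, num_groups, eps=1e-5):
    """One plausible reading of PLN: split the hidden vector into
    parallel groups and layer-normalize each group independently,
    so the operation is nonlinear and can replace an activation."""
    batch, width = h.shape
    g = h.reshape(batch, num_groups, width // num_groups)
    mean = g.mean(axis=-1, keepdims=True)
    var = g.var(axis=-1, keepdims=True)
    return ((g - mean) / np.sqrt(var + eps)).reshape(batch, width)

h = np.random.randn(4, 8)
out = parallel_layer_norm(h, num_groups=4)  # LayerNorm over groups of 2 neurons
```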


An elementary proof of a universal approximation theorem

Monico, Chris

arXiv.org Artificial Intelligence

There are several versions of universal approximation theorems known, including the very well-known ones from [1, 2, 3]. Each of them states that some collection of neural networks is dense in some space of continuous functions with respect to the uniform norm. In this short note, we present what we believe to be a new and atypically elementary proof of one such theorem. If σ is a 0-1 squashing function (a.k.a. a sigmoidal function), we show that the collection of neural networks with three hidden layers and activation function σ (except at the output) is dense in the space C(K) of real-valued continuous functions on a compact set $K \subset \mathbb{R}^n$.
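
As a concrete, much weaker illustration of the density statement (not of the three-hidden-layer proof itself), here is a minimal numpy sketch that approximates a continuous function on [0, 1] by a single layer of steep logistic units; the target function, number of units, and steepness are illustrative choices:

```python
import numpy as np

def sigma(x):  # logistic squashing function
    return 1.0 / (1.0 + np.exp(-x))

def approx(f, n, steep=200.0):
    """Approximate f on [0, 1] by f(0) plus a sum of steep logistic
    units, each carrying the increment of f across one subinterval."""
    def g(x):
        out = f(0.0) * np.ones_like(x)
        for k in range(1, n + 1):
            out += (f(k / n) - f((k - 1) / n)) * sigma(steep * (x - k / n))
        return out
    return g

f = lambda x: np.sin(2 * np.pi * x)
g = approx(f, n=50)
x = np.linspace(0.0, 1.0, 1000)
print(np.max(np.abs(f(x) - g(x))))  # uniform error shrinks as n grows
```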


Morph-SSL: Self-Supervision with Longitudinal Morphing to Predict AMD Progression from OCT

Chakravarty, Arunava, Emre, Taha, Leingang, Oliver, Riedl, Sophie, Mai, Julia, Scholl, Hendrik P. N., Sivaprasad, Sobha, Rueckert, Daniel, Lotery, Andrew, Schmidt-Erfurth, Ursula, Bogunović, Hrvoje

arXiv.org Artificial Intelligence

The lack of reliable biomarkers makes predicting the conversion from intermediate to neovascular age-related macular degeneration (iAMD, nAMD) a challenging task. We develop a Deep Learning (DL) model to predict the future risk of conversion of an eye from iAMD to nAMD from its current OCT scan. Although eye clinics generate vast amounts of longitudinal OCT scans to monitor AMD progression, only a small subset can be manually labeled for supervised DL. To address this issue, we propose Morph-SSL, a novel Self-supervised Learning (SSL) method for longitudinal data. It uses pairs of unlabelled OCT scans from different visits and involves morphing the scan from the previous visit to the next. The Decoder predicts the transformation for morphing and ensures a smooth feature manifold that can generate intermediate scans between visits through linear interpolation. Next, the Morph-SSL trained features are input to a Classifier which is trained in a supervised manner to model the cumulative probability distribution of the time to conversion with a sigmoidal function. Morph-SSL was trained on unlabelled scans of 399 eyes (3570 visits). The Classifier was evaluated with a five-fold cross-validation on 2418 scans from 343 eyes with clinical labels of the conversion date. The Morph-SSL features achieved an AUC of 0.766 in predicting the conversion to nAMD within the next 6 months, outperforming the same network when trained end-to-end from scratch or pre-trained with popular SSL methods. Automated prediction of the future risk of nAMD onset can enable timely treatment and individualized AMD management.
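
A minimal sketch of the final modeling step mentioned above, a sigmoidal cumulative probability of conversion as a function of time; the parametrization and parameter values are assumptions for illustration, not the paper's:

```python
import numpy as np

def conversion_cdf(t, t0, scale):
    """Sigmoidal cumulative probability that conversion to nAMD
    occurs within t months of the current scan."""
    return 1.0 / (1.0 + np.exp(-(t - t0) / scale))

# Example: risk of conversion within the next 6 months for an eye whose
# predicted midpoint t0 is 12 months with a 4-month scale (hypothetical values).
print(conversion_cdf(6.0, t0=12.0, scale=4.0))
```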


Continuous approximation by convolutional neural networks with a sigmoidal function

Chang, Weike

arXiv.org Artificial Intelligence

In this paper we present a class of convolutional neural networks (CNNs), called non-overlapping CNNs, in the study of the approximation capabilities of CNNs. We prove that such networks with a sigmoidal activation function are capable of approximating arbitrary continuous functions defined on compact input sets with any desired degree of accuracy. This result extends existing results, in which only multilayer feedforward networks were considered as approximators. Evaluations illustrate the accuracy and efficiency of our result and indicate that the proposed non-overlapping CNNs are less sensitive to noise.
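
A minimal numpy sketch of one plausible reading of a non-overlapping convolutional layer, a 1D convolution whose stride equals the filter width so that receptive fields tile the input without overlap, followed by a sigmoidal activation (the paper's exact architecture may differ):

```python
import numpy as np

def sigma(x):
    return 1.0 / (1.0 + np.exp(-x))

def nonoverlap_conv1d(x, w, b):
    """1D convolution with stride == len(w): the windows tile the
    input without overlap; a sigmoidal activation is applied after."""
    k = len(w)
    windows = x[: (len(x) // k) * k].reshape(-1, k)
    return sigma(windows @ w + b)

x = np.linspace(0.0, 1.0, 12)
out = nonoverlap_conv1d(x, w=np.array([0.5, -0.2, 0.1]), b=0.0)  # shape (4,)
```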



A Unified and Constructive Framework for the Universality of Neural Networks

Bui-Thanh, Tan

arXiv.org Machine Learning

One of the reasons why many neural networks are capable of replicating complicated tasks or functions is their universal approximation property. Though the past few decades have seen tremendous advances in theories of neural networks, a single constructive framework for neural network universality remains unavailable. This paper is an effort to provide a unified and constructive framework for the universality of a large class of activations, including most existing ones. At the heart of the framework is the concept of neural network approximate identity (nAI). The main result is: {\em any nAI activation function is universal}. It turns out that most existing activations are nAI, and thus universal in the space of continuous functions on compacta. The framework has the following main properties. First, it is constructive, using elementary means from functional analysis, probability theory, and numerical analysis. Second, it is the first unified attempt that is valid for most existing activations. Third, as a by-product, the framework provides the first universality proof for some existing activation functions, including Mish, SiLU, ELU, and GELU. Fourth, it provides new proofs for most activation functions. Fifth, it discovers new activations with a guaranteed universality property. Sixth, for a given activation and error tolerance, the framework provides precisely the architecture of the corresponding one-hidden-layer neural network with a predetermined number of neurons and the values of the weights and biases. Seventh, the framework allows us to abstractly present the first universal approximation result with a favorable non-asymptotic rate.
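
The central device, an activation whose scaled combinations behave like an approximate identity, can be illustrated with a hedged numpy sketch: a bump built from two shifted sigmoids, scaled and normalized, is integrated against samples of f, and the approximation tightens as the scale grows. This illustrates the idea only and is not the paper's construction:

```python
import numpy as np

def sigma(x):
    return 1.0 / (1.0 + np.exp(-x))

def bump(u):
    # difference of shifted sigmoids; integrates to 2 over the real line
    return sigma(u + 1.0) - sigma(u - 1.0)

def nai_approx(f, x, n, m=2000):
    """Approximate f(x) by a quadrature of f against the scaled,
    normalized bump n*bump(n*t)/2, which acts as an approximate identity."""
    grid = np.linspace(-1.0, 2.0, m)  # sample f on a slightly larger interval
    weights = (grid[1] - grid[0]) * n * bump(n * (x[:, None] - grid[None, :])) / 2.0
    return weights @ f(grid)

f = lambda t: np.sin(2 * np.pi * t)
x = np.linspace(0.0, 1.0, 200)
for n in (20, 40, 80):
    print(n, np.max(np.abs(nai_approx(f, x, n) - f(x))))  # error shrinks as n grows
```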


Representation Theorem for Matrix Product States

Guo, Erdong, Draper, David

arXiv.org Machine Learning

In this work, we investigate the universal representation capacity of Matrix Product States (MPS) from the perspective of boolean functions and continuous functions. We show that MPS can accurately realize arbitrary boolean functions by providing a construction of the corresponding MPS structure for an arbitrarily given boolean gate. Moreover, we prove that the function space of MPS with the scale-invariant sigmoidal activation is dense in the space of continuous functions defined on a compact subspace of the $n$-dimensional real coordinate space $\mathbb{R}^{n}$. We study the relation between MPS and neural networks and show that an MPS with a scale-invariant sigmoidal function is equivalent to a one-hidden-layer neural network equipped with a kernel function. We construct the equivalent neural networks for several specific MPS models and show that non-linear kernels, such as the polynomial kernel, which introduce couplings between different components of the input, appear naturally in the equivalent neural networks. Finally, we discuss the realization of a Gaussian Process (GP) with infinitely wide MPS by studying their equivalent neural networks.
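
A minimal numpy sketch of evaluating an MPS on a boolean input: one matrix is picked per site according to the input bit, the matrices are multiplied in order, and the product is contracted with boundary vectors. The bond dimension and random tensors are purely illustrative:

```python
import numpy as np

def mps_eval(bits, tensors, left, right):
    """Evaluate an MPS: for each site i, pick the matrix tensors[i][bit]
    and multiply them in order between boundary vectors left and right."""
    v = left
    for bit, site in zip(bits, tensors):
        v = v @ site[bit]
    return float(v @ right)

# Tiny 2-site MPS with bond dimension 2 (random tensors, for illustration only).
rng = np.random.default_rng(0)
tensors = [rng.standard_normal((2, 2, 2)) for _ in range(2)]  # (bit, bond, bond)
left, right = np.array([1.0, 0.0]), np.array([1.0, 0.0])
print(mps_eval([0, 1], tensors, left, right))
```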


The Math Behind Logistic Regression

#artificialintelligence

Have you ever wondered how logistic regression works and how its loss function is minimized by gradient descent? This article is for you. Before starting with logistic regression, it is important to understand what supervised learning is. Supervised learning means training a model on a dataset that contains a target (output) column.
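
A minimal numpy sketch of the mechanics the article goes on to describe, the sigmoid turning a linear score into a probability and batch gradient descent minimizing the log loss; the toy data, learning rate, and epoch count are illustrative choices:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, lr=0.1, epochs=1000):
    """Minimize the log loss (binary cross-entropy) by batch gradient descent."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = sigmoid(X @ w + b)           # predicted probabilities
        grad_w = X.T @ (p - y) / len(y)  # gradient of the mean log loss w.r.t. w
        grad_b = np.mean(p - y)          # gradient w.r.t. the bias
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Toy 1-D data: class 1 when the feature exceeds roughly 0.5.
X = np.array([[0.1], [0.2], [0.4], [0.6], [0.8], [0.9]])
y = np.array([0, 0, 0, 1, 1, 1])
w, b = fit_logistic(X, y)
print(sigmoid(X @ w + b).round(2))  # probabilities increase with the feature
```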