Goto

Collaborating Authors

 approximation result




New Approximation Results and Optimal Estimation for Fully Connected Deep Neural Networks

Tang, Zhaoji

arXiv.org Machine Learning

\citet{farrell2021deep} establish non-asymptotic high-probability bounds for general deep feedforward neural network (with rectified linear unit activation function) estimators, with \citet[Theorem 1]{farrell2021deep} achieving a suboptimal convergence rate for fully connected feedforward networks. The authors suggest that improved approximation of fully connected networks could yield sharper versions of \citet[Theorem 1]{farrell2021deep} without altering the theoretical framework. By deriving approximation bounds specifically for a narrower fully connected deep neural network, this note demonstrates that \citet[Theorem 1]{farrell2021deep} can be improved to achieve an optimal rate (up to a logarithmic factor). Furthermore, this note briefly shows that deep neural network estimators can mitigate the curse of dimensionality for functions with compositional structure and functions defined on manifolds.


Constructive Approximation under Carleman's Condition, with Applications to Smoothed Analysis

Koehler, Frederic, Wu, Beining

arXiv.org Machine Learning

A classical result of Carleman, based on the theory of quasianalytic functions, shows that polynomials are dense in $L^2(μ)$ for any $μ$ such that the moments $\int x^k dμ$ do not grow too rapidly as $k \to \infty$. In this work, we develop a fairly tight quantitative analogue of the underlying Denjoy-Carleman theorem via complex analysis, and show that this allows for nonasymptotic control of the rate of approximation by polynomials for any smooth function with polynomial growth at infinity. In many cases, this allows us to establish $L^2$ approximation-theoretic results for functions over general classes of distributions (e.g., multivariate sub-Gaussian or sub-exponential distributions) which were previously known only in special cases. As one application, we show that the Paley--Wiener class of functions bandlimited to $[-Ω,Ω]$ admits superexponential rates of approximation over all strictly sub-exponential distributions, which leads to a new characterization of the class. As another application, we solve an open problem recently posed by Chandrasekaran, Klivans, Kontonis, Meka and Stavropoulos on the smoothed analysis of learning, and also obtain quantitative improvements to their main results and applications.


A general technique for approximating high-dimensional empirical kernel matrices

Kaushik, Chiraag, Romberg, Justin, Muthukumar, Vidya

arXiv.org Machine Learning

We present simple, user-friendly bounds for the expected operator norm of a random kernel matrix under general conditions on the kernel function $k(\cdot,\cdot)$. Our approach uses decoupling results for U-statistics and the non-commutative Khintchine inequality to obtain upper and lower bounds depending only on scalar statistics of the kernel function and a ``correlation kernel'' matrix corresponding to $k(\cdot,\cdot)$. We then apply our method to provide new, tighter approximations for inner-product kernel matrices on general high-dimensional data, where the sample size and data dimension are polynomially related. Our method obtains simplified proofs of existing results that rely on the moment method and combinatorial arguments while also providing novel approximation results for the case of anisotropic Gaussian data. Finally, using similar techniques to our approximation result, we show a tighter lower bound on the bias of kernel regression with anisotropic Gaussian data.




$\mathcal{C}^1$-approximation with rational functions and rational neural networks

Morina, Erion, Holler, Martin

arXiv.org Artificial Intelligence

We show that suitably regular functions can be approximated in the $\mathcal{C}^1$-norm both with rational functions and rational neural networks, including approximation rates with respect to width and depth of the network, and degree of the rational functions. As consequence of our results, we further obtain $\mathcal{C}^1$-approximation results for rational neural networks with the $\text{EQL}^÷$ and ParFam architecture, both of which are important in particular in the context of symbolic regression for physical law learning.