New Approximation Results and Optimal Estimation for Fully Connected Deep Neural Networks
\citet{farrell2021deep} establish non-asymptotic high-probability bounds for general deep feedforward neural network estimators with rectified linear unit (ReLU) activation, with \citet[Theorem 1]{farrell2021deep} achieving a suboptimal convergence rate for fully connected feedforward networks. The authors suggest that improved approximation results for fully connected networks could yield sharper versions of \citet[Theorem 1]{farrell2021deep} without altering the theoretical framework. By deriving approximation bounds specifically for narrower fully connected deep neural networks, this note demonstrates that \citet[Theorem 1]{farrell2021deep} can be improved to achieve an optimal rate (up to a logarithmic factor). Furthermore, this note briefly shows that deep neural network estimators can mitigate the curse of dimensionality for functions with compositional structure and for functions defined on manifolds.
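For reference (background not stated in the abstract), the optimality benchmark here is the classical minimax rate for nonparametric regression (Stone, 1982): for a $\beta$-smooth regression function on $[0,1]^d$,
$$\inf_{\hat{f}} \; \sup_{f_0 \in \mathcal{H}^{\beta}([0,1]^d)} \mathbb{E} \big\| \hat{f} - f_0 \big\|_{L^2}^2 \asymp n^{-\frac{2\beta}{2\beta+d}},$$
so an estimator attaining $n^{-2\beta/(2\beta+d)}$ up to logarithmic factors is rate-optimal in this sense.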
Constructive Approximation under Carleman's Condition, with Applications to Smoothed Analysis
Koehler, Frederic, Wu, Beining
A classical result of Carleman, based on the theory of quasianalytic functions, shows that polynomials are dense in $L^2(\mu)$ for any $\mu$ such that the moments $\int x^k \, d\mu$ do not grow too rapidly as $k \to \infty$. In this work, we develop a fairly tight quantitative analogue of the underlying Denjoy-Carleman theorem via complex analysis, and show that this allows for nonasymptotic control of the rate of approximation by polynomials for any smooth function with polynomial growth at infinity. In many cases, this allows us to establish $L^2$ approximation-theoretic results for functions over general classes of distributions (e.g., multivariate sub-Gaussian or sub-exponential distributions) which were previously known only in special cases. As one application, we show that the Paley--Wiener class of functions bandlimited to $[-\Omega,\Omega]$ admits superexponential rates of approximation over all strictly sub-exponential distributions, which leads to a new characterization of the class. As another application, we solve an open problem recently posed by Chandrasekaran, Klivans, Kontonis, Meka and Stavropoulos on the smoothed analysis of learning, and also obtain quantitative improvements to their main results and applications.
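For reference, the classical condition alluded to above can be stated in terms of the moments $m_k = \int x^k \, d\mu$: polynomials are dense in $L^2(\mu)$ whenever
$$\sum_{k=1}^{\infty} m_{2k}^{-1/(2k)} = \infty,$$
which holds in particular when $m_{2k} \le C^k (2k)!$ for some constant $C$, so that sub-Gaussian and sub-exponential distributions are covered.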
A general technique for approximating high-dimensional empirical kernel matrices
Kaushik, Chiraag, Romberg, Justin, Muthukumar, Vidya
We present simple, user-friendly bounds for the expected operator norm of a random kernel matrix under general conditions on the kernel function $k(\cdot,\cdot)$. Our approach uses decoupling results for U-statistics and the non-commutative Khintchine inequality to obtain upper and lower bounds depending only on scalar statistics of the kernel function and a ``correlation kernel'' matrix corresponding to $k(\cdot,\cdot)$. We then apply our method to provide new, tighter approximations for inner-product kernel matrices on general high-dimensional data, where the sample size and data dimension are polynomially related. Our method obtains simplified proofs of existing results that rely on the moment method and combinatorial arguments while also providing novel approximation results for the case of anisotropic Gaussian data. Finally, using similar techniques to our approximation result, we show a tighter lower bound on the bias of kernel regression with anisotropic Gaussian data.
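As a concrete (if much simpler) illustration of the phenomenon this abstract concerns, the sketch below compares the operator norm of an inner-product kernel matrix on isotropic Gaussian data with that of its degree-1 Taylor surrogate; this is the classical linearization effect, not the paper's method, and all names and parameter choices in the snippet are ours.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 500, 1000                               # sample size and dimension
X = rng.standard_normal((n, d)) / np.sqrt(d)   # rows ~ N(0, I_d / d)

G = X @ X.T                                    # inner-product ("correlation kernel") matrix
K = np.exp(G)                                  # kernel matrix for k(x, y) = exp(<x, y>)

# Degree-1 Taylor surrogate of K around <x, y> = 0, keeping the exact diagonal:
K_lin = np.ones((n, n)) + G
np.fill_diagonal(K_lin, np.diag(K))

op = lambda A: np.linalg.norm(A, ord=2)        # operator (spectral) norm
print(f"||K||_op         = {op(K):.3f}")
print(f"||K - K_lin||_op = {op(K - K_lin):.3f}")
```

In this regime the off-diagonal inner products are $O(1/\sqrt{d})$, so the approximation error is small relative to $\|K\|$; the paper's contribution is a general technique yielding such bounds under much weaker assumptions on the kernel and the data.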
$\mathcal{C}^1$-approximation with rational functions and rational neural networks
We show that suitably regular functions can be approximated in the $\mathcal{C}^1$-norm both with rational functions and with rational neural networks, including approximation rates with respect to the width and depth of the network and the degree of the rational functions. As a consequence of our results, we further obtain $\mathcal{C}^1$-approximation results for rational neural networks with the $\text{EQL}^{\div}$ and ParFam architectures, both of which are particularly important in the context of symbolic regression for physical law learning.
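As a purely illustrative sketch (our own construction, not the paper's), one can fit a low-degree rational function to a smooth target by linearized least squares and check a grid-based proxy for the $\mathcal{C}^1$ error:

```python
import numpy as np

f = np.exp                                   # smooth target on [-1, 1]
xs = np.linspace(-1.0, 1.0, 2001)
fx = f(xs)

# Type-(3,3) rational r = p/q with q(x) = 1 + q1 x + q2 x^2 + q3 x^3,
# fitted via the standard linearization  f(x) q(x) - p(x) ~ 0.
m = 3
V = np.vander(xs, m + 1, increasing=True)    # columns 1, x, x^2, x^3
A = np.hstack([V, -V[:, 1:] * fx[:, None]])
coef, *_ = np.linalg.lstsq(A, fx, rcond=None)
p, q = coef[: m + 1], np.concatenate([[1.0], coef[m + 1:]])

den = V @ q
assert (den > 0).all()                       # sanity check: no poles on the grid
r = (V @ p) / den

# Grid proxy for the C^1 error: sup norm of the function and of its derivative.
err0 = np.max(np.abs(fx - r))
err1 = np.max(np.abs(np.gradient(fx, xs) - np.gradient(r, xs)))
print(f"sup|f - r| ~ {err0:.2e},  sup|f' - r'| ~ {err1:.2e}")
```

The paper's results quantify how fast such errors can be driven to zero as the degree (or the width and depth of a rational network) grows.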