Neural Information Processing Systems

Let 0 < ε < 1, 0 < ℓ < 1, k ≥ 1, and let r be the Zolotarev sign function Z_{3^k}(·; ℓ) of type (3^k, 3^k − 1). One finds using the Karush–Kuhn–Tucker conditions [6] that k_1 = ··· = k_M = λ. Proof of Lemma 2. Let 0 < ε < 1 and let R: [−1, 1] → [−1, 1] be a rational function. Take R̃(x) = R(2x − 1), which is still a rational function. Without loss of generality, we can assume that R is an irreducible rational function (otherwise cancel factors until it is irreducible).
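
The substitution in the last step only reparametrizes the domain: composing with the affine map x ↦ 2x − 1 sends [0, 1] onto [−1, 1], so rationality and uniform bounds carry over unchanged. A minimal numerical sketch, where the rational function below is an arbitrary illustration rather than the approximant from the lemma:

```python
import numpy as np

# Illustrative rational function standing in for R on [-1, 1];
# the coefficients are arbitrary and not taken from the paper.
def R(x):
    return (x**3 + 0.5 * x) / (x**2 + 1.0)

# R_tilde(x) = R(2x - 1) maps [0, 1] onto [-1, 1] and is again a
# rational function of the same type.
def R_tilde(x):
    return R(2.0 * x - 1.0)

xm11 = np.linspace(-1.0, 1.0, 2001)   # grid on [-1, 1]
x01 = np.linspace(0.0, 1.0, 2001)     # grid on [0, 1]

# The supremum of |R_tilde| over [0, 1] equals that of |R| over [-1, 1].
print(np.max(np.abs(R(xm11))), np.max(np.abs(R_tilde(x01))))
```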



Supplementary Material of Rational neural networks

Neural Information Processing Systems

Finally, we use the identity ReLU(x) = (|x| + x)/2, x ∈ ℝ, to define a rational approximation to the ReLU function on the interval [−1, 1] as r̃(x) = (1/2)(x r(x)/(1 + ε) + x). Therefore, we have the following inequalities for x ∈ [−1, 1]: |ReLU(x) − r̃(x)| = (1/2)| |x| − x r(x)/(1 + ε) | ≤ 1/(2(1 + ε)) (| |x| − x r(x)| + ε|x|) ≤ ε/(1 + ε) ≤ ε. We now show that ReLU neural networks can approximate rational functions. The structure of the proof closely follows [12, Lemma 1.3]. The statement of Theorem 3 comes in two parts, and we prove them separately.
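
The construction and its error bound can be checked numerically. A minimal sketch, using a composed Newton–Schulz iteration as a simple stand-in for the Zolotarev sign approximant r used in the paper (the choice of r here is illustrative only):

```python
import numpy as np

def sign_approx(x, k=8):
    # Composed cubic (Newton-Schulz) approximation to sign(x) on [-1, 1];
    # a simple stand-in for the Zolotarev rational approximant in the paper.
    y = np.asarray(x, dtype=float)
    for _ in range(k):
        y = 0.5 * y * (3.0 - y**2)
    return y

x = np.linspace(-1.0, 1.0, 20001)
r = sign_approx(x)

# eps bounds the error of x*r(x) as an approximation to |x| on [-1, 1].
eps = np.max(np.abs(np.abs(x) - x * r))

# ReLU(x) = (|x| + x)/2 motivates r_tilde(x) = (x*r(x)/(1+eps) + x)/2.
r_tilde = 0.5 * (x * r / (1.0 + eps) + x)
relu = np.maximum(x, 0.0)

print("eps                  =", eps)
print("max |ReLU - r_tilde| =", np.max(np.abs(relu - r_tilde)))
print("bound eps/(1+eps)    =", eps / (1.0 + eps))
```

The observed maximum error stays below ε/(1 + ε), matching the inequality chain above.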



Data-driven discovery of Green's functions

Boullé, Nicolas

arXiv.org Artificial Intelligence

Discovering hidden partial differential equations (PDEs) and operators from data is an important topic at the frontier between machine learning and numerical analysis. This doctoral thesis introduces theoretical results and deep learning algorithms to learn Green's functions associated with linear partial differential equations and rigorously justify PDE learning techniques. A theoretically rigorous algorithm is derived to obtain a learning rate, which characterizes the amount of training data needed to approximately learn Green's functions associated with elliptic PDEs. The construction connects the fields of PDE learning and numerical linear algebra by extending the randomized singular value decomposition to non-standard Gaussian vectors and Hilbert--Schmidt operators, and exploiting the low-rank hierarchical structure of Green's functions using hierarchical matrices. Rational neural networks (NNs) are introduced and consist of neural networks with trainable rational activation functions. The highly compositional structure of these networks, combined with rational approximation theory, implies that rational functions have higher approximation power than standard activation functions. In addition, rational NNs may have poles and take arbitrarily large values, which is ideal for approximating functions with singularities such as Green's functions. Finally, theoretical results on Green's functions and rational NNs are combined to design a human-understandable deep learning method for discovering Green's functions from data. This approach complements state-of-the-art PDE learning techniques, as a wide range of physics can be captured from the learned Green's functions such as dominant modes, symmetries, and singularity locations.
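
The randomized singular value decomposition that the thesis extends to Hilbert–Schmidt operators has a short matrix-level core. A minimal sketch of that standard algorithm, run on an arbitrary low-rank test matrix rather than data from the thesis:

```python
import numpy as np

def randomized_svd(A, rank, oversample=10, seed=0):
    # Standard randomized SVD of a matrix A (Halko-Martinsson-Tropp style).
    # The thesis extends this idea to Hilbert-Schmidt operators and
    # non-standard Gaussian sketches; only the matrix case is sketched here.
    rng = np.random.default_rng(seed)
    m, n = A.shape
    Omega = rng.standard_normal((n, rank + oversample))   # Gaussian sketch
    Q, _ = np.linalg.qr(A @ Omega)                         # approximate range of A
    U_small, s, Vt = np.linalg.svd(Q.T @ A, full_matrices=False)
    return Q @ U_small[:, :rank], s[:rank], Vt[:rank, :]

# Quick check on a random rank-15 matrix.
rng = np.random.default_rng(1)
A = rng.standard_normal((200, 15)) @ rng.standard_normal((15, 300))
U, s, Vt = randomized_svd(A, rank=15)
print(np.linalg.norm(A - U @ np.diag(s) @ Vt) / np.linalg.norm(A))
```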


Rational neural networks

Boullé, Nicolas, Nakatsukasa, Yuji, Townsend, Alex

arXiv.org Machine Learning

We consider neural networks with rational activation functions. The choice of the nonlinear activation function in deep learning architectures is crucial and heavily impacts the performance of a neural network. We establish optimal bounds in terms of network complexity and prove that rational neural networks approximate smooth functions more efficiently than ReLU networks. The flexibility and smoothness of rational activation functions make them an attractive alternative to ReLU, as we demonstrate with numerical experiments.
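
A minimal sketch of the basic building block, a trainable rational activation of type (3, 2), written here with PyTorch. The initial coefficients below are illustrative placeholders; the paper initializes the coefficients from a best rational approximation to ReLU, with values not reproduced here.

```python
import torch
import torch.nn as nn

class RationalActivation(nn.Module):
    """Trainable rational activation r(x) = P(x) / Q(x) of type (3, 2).

    The coefficients are learned jointly with the network weights. The
    initial values are illustrative placeholders, not the paper's.
    """
    def __init__(self):
        super().__init__()
        self.p = nn.Parameter(torch.tensor([0.02, 0.50, 1.60, 1.20]))  # numerator, degree 3
        self.q = nn.Parameter(torch.tensor([1.00, 0.00]))              # denominator, degree 2 (leading coefficient fixed to 1)

    def forward(self, x):
        num = ((self.p[3] * x + self.p[2]) * x + self.p[1]) * x + self.p[0]
        den = (x + self.q[1]) * x + self.q[0]
        # Guard against division by zero in this sketch; the networks in the
        # paper are allowed to have poles, which is part of their appeal.
        return num / den.abs().clamp_min(1e-6)

# Drop-in replacement for ReLU in a small fully connected network.
net = nn.Sequential(nn.Linear(2, 32), RationalActivation(),
                    nn.Linear(32, 32), RationalActivation(),
                    nn.Linear(32, 1))
print(net(torch.randn(4, 2)).shape)  # torch.Size([4, 1])
```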