ReLU Network Approximation in Terms of Intrinsic Parameters

Zuowei Shen, Haizhao Yang, Shijun Zhang

Nov-15-2021–arXiv.org Machine Learning

This paper studies the approximation error of ReLU networks in terms of the number of intrinsic parameters (i.e., those depending on the target function $f$). First, we prove by construction that, for any Lipschitz continuous function $f$ on $[0,1]^d$ with a Lipschitz constant $\lambda>0$, a ReLU network with $n+2$ intrinsic parameters can approximate $f$ with an exponentially small error $5\lambda \sqrt{d}\,2^{-n}$ measured in the $L^p$-norm for $p\in [1,\infty)$. More generally, for an arbitrary continuous function $f$ on $[0,1]^d$ with a modulus of continuity $\omega_f(\cdot)$, the approximation error is $\omega_f(\sqrt{d}\, 2^{-n})+2^{-n+2}\omega_f(\sqrt{d})$. Next, we extend these two results from the $L^p$-norm to the $L^\infty$-norm at a price of $3^d n+2$ intrinsic parameters. Finally, by using a high-precision binary representation and the bit extraction technique via a fixed ReLU network independent of the target function, we design, theoretically, a ReLU network with only three intrinsic parameters to approximate Hölder continuous functions with an arbitrarily small error.
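As a quick numerical reading of the two error bounds stated above, the following minimal sketch evaluates them for a few values of $n$. It only computes the formulas from the abstract and does not construct any network; the function names `lipschitz_bound` and `modulus_bound`, and the sample values of $\lambda$, $d$, and $n$, are illustrative assumptions. For a Lipschitz function one can take $\omega_f(r)=\lambda r$, in which case the general bound becomes $\lambda\sqrt{d}\,2^{-n}+4\cdot 2^{-n}\lambda\sqrt{d}=5\lambda\sqrt{d}\,2^{-n}$, matching the Lipschitz bound.

```python
import math


def lipschitz_bound(lam, d, n):
    """Bound 5 * lam * sqrt(d) * 2^(-n) quoted for Lipschitz f on [0,1]^d."""
    return 5.0 * lam * math.sqrt(d) * 2.0 ** (-n)


def modulus_bound(omega, d, n):
    """Bound omega(sqrt(d) * 2^(-n)) + 2^(-n+2) * omega(sqrt(d)) for continuous f.

    `omega` is a callable standing in for the modulus of continuity omega_f.
    """
    root_d = math.sqrt(d)
    return omega(root_d * 2.0 ** (-n)) + 2.0 ** (-n + 2) * omega(root_d)


if __name__ == "__main__":
    lam, d = 2.0, 3  # illustrative values, not taken from the paper

    def omega_lip(r):
        # Modulus of continuity of a lam-Lipschitz function: omega_f(r) = lam * r.
        return lam * r

    for n in (4, 8, 16):
        # Both columns should agree, and each step in n halves the bound.
        print(n, lipschitz_bound(lam, d, n), modulus_bound(omega_lip, d, n))
```

Each additional intrinsic parameter halves the bound, which is the sense in which the error is exponentially small in $n$.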

Genre:
- Research Report (0.84)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning (1.00)
  - Machine Learning
    - Neural Networks > Deep Learning (0.69)
    - Learning Graphical Models > Directed Networks
      - Bayesian Learning (0.46)