Understanding Weight Normalized Deep Neural Networks with Rectified Linear Units
–Neural Information Processing Systems
This paper presents a general framework for norm-based capacity control of $L_{p,q}$ weight normalized deep neural networks. We establish upper bounds on the Rademacher complexity of this family. With $L_{p,q}$ normalization where $q\le p^*$ and $1/p+1/p^{*}=1$, we show that the capacity bound is independent of the width and depends on the depth only through a square-root term. We further analyze the approximation properties of $L_{p,q}$ weight normalized deep neural networks. In particular, for an $L_{1,\infty}$ weight normalized network, the approximation error can be controlled by the $L_1$ norm of the output layer, and the corresponding generalization error depends on the architecture only through the square root of the depth.
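As a concrete illustration of the quantity being controlled, the sketch below computes an $L_{p,q}$ norm of a weight matrix: an $L_p$ norm over each neuron's incoming weight vector, followed by an $L_q$ norm across neurons. The row-wise convention and the function name are assumptions for illustration, not the paper's exact definition.

```python
import numpy as np

def lpq_norm(W, p, q):
    """L_{p,q} norm of a weight matrix (illustrative convention):
    take the L_p norm of each row (a neuron's incoming weights),
    then the L_q norm of the resulting vector of row norms.
    q = np.inf gives the max row norm, as in the L_{1,inf} case."""
    row_norms = np.linalg.norm(W, ord=p, axis=1)
    return np.linalg.norm(row_norms, ord=q)

# Example: with L_{1,inf}, the norm is the largest row-wise L_1 norm.
W = np.array([[3.0, 4.0],
              [0.0, 1.0]])
print(lpq_norm(W, 1, np.inf))  # max(|3|+|4|, |0|+|1|) = 7.0
```

Normalizing a layer then amounts to rescaling `W` so this quantity equals a fixed constant, which is what makes the resulting capacity bounds independent of the layer width.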
Dec-31-2018