On Dropout and Nuclear Norm Regularization
arXiv.org Artificial Intelligence
We give a formal and complete characterization of the explicit regularizer induced by dropout in deep linear networks with squared loss. We show that (a) the explicit regularizer is composed of an $\ell_2$-path regularizer and other terms that are also re-scaling invariant, and (b) the convex envelope of the induced regularizer is the squared nuclear norm of the network map. In addition, (c) we characterize the global optima of the dropout objective for a sufficiently large dropout rate. We validate our theoretical findings with empirical results.
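To make the notion of an explicit regularizer concrete, here is a minimal sketch for the single-hidden-layer special case only, not the general deep result the abstract describes. It assumes a linear map $x \mapsto U \,\mathrm{diag}(b)\, V^\top x$ with independent masks $b_i \sim \mathrm{Bernoulli}(p)/p$, and checks the standard identity that the expected squared loss equals the plain squared loss plus $\frac{1-p}{p}\sum_i \|u_i\|^2 (v_i^\top x)^2$. The names `keep_prob`, `expected_dropout_loss`, `explicit_regularizer`, and `demo` are illustrative and do not come from the paper.

```python
import itertools
import random


def network_output(U, V, x, mask):
    """Compute U diag(mask) V^T x for U (d_out x r) and V (d_in x r)."""
    r = len(mask)
    hidden = [mask[i] * sum(V[k][i] * x[k] for k in range(len(x))) for i in range(r)]
    return [sum(U[j][i] * hidden[i] for i in range(r)) for j in range(len(U))]


def squared_error(y, y_hat):
    return sum((a - b) ** 2 for a, b in zip(y, y_hat))


def expected_dropout_loss(U, V, x, y, keep_prob):
    """Exact expectation over all 2^r masks; kept units are scaled by 1/keep_prob."""
    r = len(U[0])
    total = 0.0
    for bits in itertools.product([0, 1], repeat=r):
        kept = sum(bits)
        prob = keep_prob ** kept * (1 - keep_prob) ** (r - kept)
        mask = [b / keep_prob for b in bits]
        total += prob * squared_error(y, network_output(U, V, x, mask))
    return total


def explicit_regularizer(U, V, x, keep_prob):
    """(1 - p)/p * sum_i ||u_i||^2 (v_i^T x)^2, with u_i, v_i the i-th columns."""
    r = len(U[0])
    reg = 0.0
    for i in range(r):
        u_norm_sq = sum(U[j][i] ** 2 for j in range(len(U)))
        v_dot_x = sum(V[k][i] * x[k] for k in range(len(x)))
        reg += u_norm_sq * v_dot_x ** 2
    return (1 - keep_prob) / keep_prob * reg


def demo(seed=0, d_in=4, d_out=3, r=3, keep_prob=0.7):
    """Return (expected dropout loss, plain loss + regularizer) on random data."""
    rng = random.Random(seed)
    U = [[rng.gauss(0, 1) for _ in range(r)] for _ in range(d_out)]
    V = [[rng.gauss(0, 1) for _ in range(r)] for _ in range(d_in)]
    x = [rng.gauss(0, 1) for _ in range(d_in)]
    y = [rng.gauss(0, 1) for _ in range(d_out)]
    lhs = expected_dropout_loss(U, V, x, y, keep_prob)
    plain = squared_error(y, network_output(U, V, x, [1.0] * r))
    rhs = plain + explicit_regularizer(U, V, x, keep_prob)
    return lhs, rhs
```

Calling `demo()` returns the two sides of the identity, which should agree up to floating-point error. The regularizer in this sketch is also unchanged when a column of `U` is multiplied by a constant and the matching column of `V` is divided by it, a small instance of the re-scaling invariance mentioned in the abstract.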
May-28-2019