Neural Information Processing Systems
It is known that adding additive Gaussian noise to the features is equivalent to an l_2 regularizer in a least-squares problem (Bishop). This paper studies multiplicative Bernoulli feature noising (dropout) in a shallow learning architecture with a general loss function, and shows that it has the effect of adapting the geometry through an l_2 regularizer that rescales the features: beta^{\top} D(beta, X) beta. The matrix D(beta, X) is an estimate of the diagonal of the Fisher information. It is worth noting that D does not depend on the labels. The regularizer equivalent to dropout is non-convex in general.
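To make the reviewed result concrete, here is a minimal sketch of the label-independent adaptive penalty for the logistic-loss case, where the second derivative of the log-partition function is A''(x·beta) = p(1 − p). The variable names, the toy data, and the delta/(1 − delta) scaling are illustrative assumptions, not the paper's notation; this is only the quadratic approximation of the dropout noising penalty.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy design matrix and parameter vector; labels are deliberately absent,
# since the penalty does not depend on them (an assumed toy setup).
n, d = 200, 5
X = rng.normal(size=(n, d))
beta = rng.normal(size=d)

delta = 0.5  # assumed dropout (deletion) probability for Bernoulli noising

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# For logistic regression, A''(x_i . beta) = p_i (1 - p_i), so a diagonal
# Fisher-information estimate at beta is D_jj = sum_i p_i (1 - p_i) x_ij^2.
p = sigmoid(X @ beta)
D_diag = (p * (1 - p)) @ (X ** 2)  # shape (d,); uses X and beta only

# Quadratic approximation of the dropout penalty: an adaptive l_2 term
# beta^T D(beta, X) beta, rescaled by the noise level.
penalty = 0.5 * delta / (1 - delta) * np.sum(D_diag * beta ** 2)
print(penalty)
```

Note that `D_diag` is computed from `X` and `beta` alone, which illustrates the review's remark that the regularizer does not depend on the labels; the non-convexity in beta enters through the dependence of `p` on beta.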