
Neural Information Processing Systems 

It is known that adding additive Gaussian noise to the features is equivalent to an l_2 regularizer in a least-squares problem (Bishop). This paper studies multiplicative Bernoulli feature noising (dropout) in a shallow learning architecture with a general loss function, and shows that it has the effect of adapting the geometry through an l_2 regularizer that rescales the features: beta^{\top} D(beta, X) beta. The matrix D(beta, X) is an estimate of the inverse diagonal Fisher information. It is worth noting that D does not depend on the labels. The equivalent regularizer of dropout is non-convex in general.
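To make the claim concrete, here is a minimal sketch (my own illustration, not code from the paper) of the quadratic-approximation dropout penalty for the logistic-regression special case: the penalty has the form of an l_2 regularizer whose per-feature scaling is built from the observed Fisher-information-style curvature p_i (1 - p_i), and, as the review notes, it uses only beta and X, never the labels. The function name and the drop probability `delta` are my choices for illustration.

```python
import numpy as np

def dropout_regularizer(beta, X, delta=0.5):
    """Sketch of the quadratic approximation to the Bernoulli-dropout
    penalty for logistic regression: a rescaled l_2 penalty of the form
    beta^T D(beta, X) beta, where D is built from the diagonal
    Fisher-information-style curvature terms p_i * (1 - p_i).
    Note: the penalty depends on X and beta only, not on the labels.
    `delta` is the probability of dropping a feature (assumed name)."""
    p = 1.0 / (1.0 + np.exp(-X @ beta))   # model probabilities sigma(x_i . beta)
    v = p * (1.0 - p)                     # per-example curvature A''(x_i . beta)
    scale = delta / (2.0 * (1.0 - delta))
    # scale * sum_i v_i * sum_j x_ij^2 beta_j^2
    #   == scale * beta^T diag(sum_i v_i x_ij^2) beta
    return scale * float(v @ ((X ** 2) @ (beta ** 2)))
```

The non-convexity the review mentions enters through v = p (1 - p): the diagonal weights themselves depend on beta, so the penalty is not a fixed quadratic form in beta.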