A Proofs
Neural Information Processing Systems
A.1 Proof of Proposition 1

Proof of Proposition 1. Recall that h denotes the vanilla activations of the network, i.e. those obtained with no noise injection. We do not inject noise into the final, predictive layer of the network, so that the noise at this layer is the noise accumulated from the noising of the previous layers.

Consider first the Taylor series expansion of the loss function with the accumulated noise defined in Proposition 1. Denoting =[

This can be deduced from the slightly opaque Faà di Bruno's formula, which gives the multivariate derivatives of a composition of functions f: R

The final equality comes from the moments of a mean-0 Gaussian, where j takes the values of the multi-index. Though these equalities already offer insight into the regularising mechanisms of GNIs, they are not easy to work with and will often be computationally intractable. We therefore include these terms in the remainder term C.
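The role of the Gaussian moments in the expansion can be illustrated numerically. Because the odd central moments of a mean-0 Gaussian vanish, the leading correction to the expected loss under an injected noise ε ~ N(0, σ²I) is the second-order Taylor term (σ²/2)·tr(∇²L). The sketch below (a toy illustration, not the paper's code; the loss function and activation values are arbitrary choices) checks this agreement by Monte Carlo:

```python
import numpy as np

# Sketch: for a smooth loss L and noise eps ~ N(0, sigma^2 I), the odd
# Taylor terms vanish in expectation, so for small sigma
#   E[L(h + eps)] - L(h)  ≈  (sigma^2 / 2) * tr(Hessian of L at h).
rng = np.random.default_rng(0)

h = np.array([1.0, -0.5, 2.0])   # "vanilla" activations (arbitrary example)
sigma = 0.1                      # small noise scale

def loss(x):
    # Toy loss; its Hessian is diagonal with entries 12 * x_i^2.
    return np.sum(x ** 4)

# Monte Carlo estimate of E[L(h + eps)] - L(h).
eps = sigma * rng.standard_normal((200_000, h.size))
mc_gap = np.mean([loss(h + e) for e in eps]) - loss(h)

# Second-order Taylor prediction: (sigma^2 / 2) * tr(Hessian).
trace_hess = np.sum(12 * h ** 2)
taylor_gap = 0.5 * sigma ** 2 * trace_hess

print(mc_gap, taylor_gap)
```

The two printed quantities agree closely for small σ; the residual gap is the higher-order (fourth-moment) contribution that the proof collects into the remainder term.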