A Proof of Proposition 2.5
Proposition 2.5 is a direct consequence of the following lemma (recall that $\nabla h(\theta) = [\partial h(\theta)]^\top$). Assume that $\partial h(\theta)\,\chi(\theta) = 0$ for all $\theta \in \Theta$, where $\chi$ denotes the vector field generating the flow.

Let us first show the direct inclusion: by the chain rule, $\nabla E_{X,Y}(\theta) = \sum_i [\partial_\theta g(\theta, x_i)]^\top \nabla_z \ell(g(\theta, x_i), y_i)$, so that each such gradient belongs to the span of the vectors $[\partial_\theta g(\theta, x)]^\top u$.

Now let us show the converse inclusion, i.e. that every vector of the form $[\partial_\theta g(\theta, x)]^\top u$ is attained. By taking $X = x$ and $Y = y$ (i.e. a data set of one feature and one target), one has, still by the chain rule,
\[
\nabla E_{x,y}(\theta) = [\partial_\theta g(\theta, x)]^\top \, \nabla_z \ell(g(\theta, x), y).
\]

We recall (cf. Example 2.10 and Example 2.11) that linear and 2-layer ReLU neural networks satisfy Assumption 2.9, which we recall reads as:

Assumption 2.9 (Local reparameterization). For each parameter $\theta \in \Theta$, the model $g(\cdot, x)$ can, on a neighborhood of $\theta$, be written as a function of a local reparameterization $z(\theta)$ of the parameters.

To proceed further we will rely on the following lemma (Lemma 2.13), which states a direct consequence of (9) in addition to Assumption 2.9 on the model $g(\theta, x)$; it holds under Assumption 2.9 for a loss $\ell(z, y)$ such that $\ell(\cdot, y)$ is $C^2$. Before proceeding to the proofs of Lemma 2.13 and Theorem 2.14, let us show that (9) holds for standard ML losses.
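As a worked check of this last claim for the quadratic loss, under the natural reading of (9) as a spanning condition on the loss gradients (the statement of (9) itself is not reproduced in this excerpt): for $\ell(z, y) = \frac{1}{2}\|z - y\|^2$ one has
\[
\nabla_z \ell(z, y) = z - y,
\]
and for any fixed $z$ the vectors $z - y$ span all of $\mathbb{R}^n$ as $y$ ranges over $\mathbb{R}^n$; moreover $\ell(\cdot, y)$ is $C^\infty$. Combined with the one-sample chain rule above, this shows that the gradients $\nabla E_{x,y}(\theta)$ reach every direction $[\partial_\theta g(\theta, x)]^\top u$, as the converse inclusion requires.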
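As an illustration of Assumption 2.9 (a minimal sketch in the spirit of Example 2.10, whose exact statement is not reproduced here), consider a two-layer linear network with parameters $\theta = (U, V)$:
\[
g\big((U, V), x\big) = U V x, \qquad z(U, V) := U V,
\]
so that $g(\theta, x) = f(z(\theta), x)$ with $f(z, x) = z x$: the model factors (here even globally) through the reparameterization $z$, as a local reparameterization requires.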
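Finally, to ground the criterion $\partial h(\theta)\,\chi(\theta) = 0$ on a concrete conserved function, one may recall the classical example for the linear model above (standard for such models, though not taken from this excerpt): $h(U, V) = U^\top U - V V^\top$. Setting $G := \sum_i \nabla_z \ell(U V x_i, y_i)\, x_i^\top$, the gradient flow reads $\dot U = -G V^\top$ and $\dot V = -U^\top G$, so that
\[
\frac{d}{dt}\big(U^\top U - V V^\top\big) = -V G^\top U - U^\top G V^\top + U^\top G V^\top + V G^\top U = 0,
\]
i.e. every entry of $h$ is conserved through the flow, in accordance with the lemma.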