A Proofs

Neural Information Processing Systems 

By Eq. (4) and the fact that σ(·) is 1-Lipschitz on [−1, +1], we get an expression of the form ‖(σ(δ) + σ(−δ))/2‖ … . Recalling Eq. (3), we get that by fixing s = … . Considering this constraint in Eq. (6), we see that for any choice of … .

To continue, it will be convenient to get rid of the absolute value in the displayed expression above. Considering Eq. (8), this is … . Fortunately, the Rademacher complexity of such composed classes was analyzed in Golowich et al. [2017], which yields a bound on the Rademacher complexity of H. To complete the proof, we need to employ a standard upper bound on … . Upper bounding this by ϵ, solving for m, and simplifying a bit, the result follows.

By Markov's inequality, it follows that with probability at least … . Thm. 3 implies that a certain dataset … .

A.5 Proofs of Thm. 4 and Thm. 5

In what follows, given a vector u, … .

Lemma 4. Given a vector … .

Upper bounding this by ϵ and solving for m, the result follows. We now utilize equation (4.20) in Ledoux and Talagrand [1991], which implies … . We now turn to prove the theorem. By the Cauchy–Schwarz and Jensen inequalities, this in turn can be upper bounded as follows: E[…]. The proof follows from a covering number argument.
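The high-probability step invoked via Markov's inequality follows a standard pattern; a generic sketch (the specific random variable and the threshold come from the omitted display, so X and δ here are placeholders):

```latex
% Markov's inequality: for a nonnegative random variable X and a > 0,
\Pr\left[X \ge a\right] \;\le\; \frac{\mathbb{E}[X]}{a}.
% Taking a = \mathbb{E}[X]/\delta gives, with probability at least 1 - \delta,
X \;<\; \frac{\mathbb{E}[X]}{\delta}.
```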
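The appeal to equation (4.20) in Ledoux and Talagrand [1991] is the Rademacher contraction principle. One common one-sided statement (the index set T below is a stand-in for the paper's function class, and the exact constant can differ between variants of the inequality):

```latex
% Contraction principle: if each \varphi_i : \mathbb{R} \to \mathbb{R} is 1-Lipschitz
% and \varepsilon_1, \dots, \varepsilon_n are i.i.d. Rademacher signs, then
\mathbb{E}_{\varepsilon} \sup_{t \in T} \sum_{i=1}^{n} \varepsilon_i \, \varphi_i(t_i)
\;\le\;
\mathbb{E}_{\varepsilon} \sup_{t \in T} \sum_{i=1}^{n} \varepsilon_i \, t_i .
```

This is what lets a 1-Lipschitz activation σ be peeled off when bounding the Rademacher complexity of the composed class.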
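The recurring "upper bound by ϵ and solve for m" step is the usual conversion from a complexity bound to a sample-size requirement; a generic instance, assuming the bound decays as C/√m for some quantity C not depending on m:

```latex
\frac{C}{\sqrt{m}} \;\le\; \epsilon
\quad\Longleftrightarrow\quad
m \;\ge\; \frac{C^{2}}{\epsilon^{2}} .
```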
