A Proofs

Neural Information Processing Systems 

A.1 Proof for Theorem 1

A.1.1 Proof for (I) and (II)

First, observe that the constraint in Equation (3) can be equivalently replaced by an inequality constraint on $f$. Therefore, the Lagrange multiplier can be restricted to $\lambda \geq 0$. We have $L$ … (II) follows from a straightforward calculation.

Proof for (III), the strong duality

We first introduce the following lemma, which is a straightforward generalization of strong Lagrange duality to the functional optimization case.

The proof of Lemma 1 is standard; we include it here for completeness. Notice that both sets A and B are convex.
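For orientation, the finite-dimensional textbook version of this separation argument can be sketched as follows. The objective $\phi$, the constraint $g$, the variable $x$, the Slater point $\tilde{x}$, and the sets $\mathcal{A}$, $\mathcal{B}$ are illustrative assumptions; they are the usual textbook choices and may differ from the sets A and B used in this proof.

```latex
% Assumed setting: \phi and g convex, a single constraint, and a Slater point
% \tilde{x} with g(\tilde{x}) < 0.
\begin{align*}
p^* &= \min_{x} \ \phi(x) \quad \text{s.t.} \quad g(x) \le 0, \\
\mathcal{A} &= \{(u, t) \in \mathbb{R}^2 : \exists\, x,\ g(x) \le u,\ \phi(x) \le t\}, \\
\mathcal{B} &= \{(0, s) \in \mathbb{R}^2 : s < p^*\}.
\end{align*}
% Both sets are convex and disjoint, so a separating hyperplane exists:
% there are (\lambda, \mu) \neq (0, 0) and \alpha such that
\begin{align*}
\lambda u + \mu t &\ge \alpha \quad \text{for all } (u, t) \in \mathcal{A}, \\
\mu s &\le \alpha \quad \text{for all } s < p^*.
\end{align*}
% \mathcal{A} is unbounded above in u and t, which forces \lambda \ge 0 and \mu \ge 0;
% letting s \to p^* gives \mu p^* \le \alpha.
% If \mu = 0, then \lambda > 0 and \lambda g(x) \ge \alpha \ge 0 for all x,
% contradicting g(\tilde{x}) < 0. Hence \mu > 0, and dividing by \mu yields
\begin{align*}
\phi(x) + \frac{\lambda}{\mu}\, g(x) \ \ge\ p^* \quad \text{for all } x
\quad \Longrightarrow \quad
\max_{\lambda' \ge 0} \ \min_{x} \ \bigl[\phi(x) + \lambda' g(x)\bigr] \ \ge\ p^*.
\end{align*}
% Weak duality gives the reverse inequality, so the two values coincide.
```

In this sketch the convexity of $\mathcal{A}$ and $\mathcal{B}$ is what licenses the separating hyperplane, and the Slater point is what rules out the degenerate case $\mu = 0$.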