Supplementary Material: Structured Prediction for Conditional Meta-Learning

Neural Information Processing Systems 

This led to the derivation of a more general (and involved) characterization of the estimator $\hat f$. We recall that the distribution $\pi$ samples the two datasets according to the process described in Section 2: first a task-distribution $\rho$ (on $\mathcal{X} \times \mathcal{Y}$) is sampled from $\mu$, and then $D^{tr}$ and $D^{val}$ are obtained by independently sampling points $(x, y)$ from $\rho$. Therefore $\pi = \pi_\mu$ can be seen as implicitly induced by $\mu$. The loss $\mathcal{L}$ is of the form (A.5) and admits derivatives of any order, namely $\mathcal{L} \in C^\infty(\mathcal{Z} \times \mathcal{Y} \times \mathcal{X})$.

Assumption 2. Assume $\Theta \subset \mathbb{R}^{d_1}$ and $\mathcal{D} \subset \mathbb{R}^{d_2}$ to be compact sets satisfying the cone condition, and assume that there exists a reproducing kernel $k: \mathcal{D} \times \mathcal{D} \to \mathbb{R}$ with associated RKHS $\mathcal{F}$ and $s > (d_1 + 2 d_2)/2$ such that the function $g: \mathcal{D} \to \mathcal{H}$ with $\mathcal{H} = W^{s,2}(\Theta \times \mathcal{D})$, characterized by
$$ g(D^{tr}) = \int \mathcal{L}(\,\cdot\,, D^{val}) \; d\pi(D^{val} \,|\, D^{tr}) \qquad \forall\, D^{tr} \in \mathcal{D}, \tag{A.7} $$
is such that $g \in \mathcal{H} \otimes \mathcal{F}$ and, for any $D \in \mathcal{D}$, the application of the operator $T(g): \mathcal{F} \to \mathcal{H}$ to the function $k(D, \cdot) \in \mathcal{F}$ satisfies $T(g)\, k(D, \cdot) = g(D)$.

The function $g$ in (A.7) can be interpreted as capturing the interaction between $\mathcal{L}$ and the meta-distribution $\pi$.
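The two-stage process that induces $\pi = \pi_\mu$ (sample a task $\rho$ from $\mu$, then draw $D^{tr}$ and $D^{val}$ independently from $\rho$) can be sketched as follows. This is a minimal illustrative sketch only: the choice of $\mu$ as a standard normal over slopes, the linear-regression task model, the noise level, and the dataset sizes are all assumptions for the example and are not part of the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_task():
    """Sample a task distribution rho from the meta-distribution mu.

    Illustrative assumption: each task is 1-D linear regression
    y = w*x + noise, with slope w ~ mu (here a standard normal).
    """
    w = rng.standard_normal()

    def rho(n):
        # i.i.d. samples (x, y) from this task's distribution on X x Y
        x = rng.uniform(-1.0, 1.0, size=n)
        y = w * x + 0.1 * rng.standard_normal(n)
        return np.stack([x, y], axis=1)

    return rho

def sample_pi(n_tr=10, n_val=10):
    """Draw (D_tr, D_val) ~ pi_mu: first rho ~ mu, then both
    datasets are sampled independently from the same rho."""
    rho = sample_task()
    return rho(n_tr), rho(n_val)

D_tr, D_val = sample_pi()
```

Because $D^{tr}$ and $D^{val}$ share the task $\rho$ but are otherwise independent, averaging a loss over repeated draws of $D^{val}$ given a fixed $D^{tr}$ would give a Monte Carlo estimate of the conditional integral defining $g$ in (A.7).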
