d4dd111a4fd973394238aca5c05bebe3-Supplemental.pdf
–Neural Information Processing Systems
$\frac{L_y^2 \alpha_k^2}{8\gamma_k}\,\mathbb{E}\big[\|h_f^k\|^2\big]$ (49)

where (d) uses the fact that $\bar{h}_f^k = \mathbb{E}[h_f^k \mid \mathcal{F}_k']$; (e) follows from Lemma 4; and (f) uses Young's inequality in the form $ab \le 2\gamma_k a^2 + \frac{b^2}{8\gamma_k}$.

Similar to Section 2, we again evaluate $\nabla F(x)$ at a certain vector $y$ in place of $y^*(x)$, which is denoted as $\bar{\nabla} f(x, y) = \nabla h(x)\,\nabla f(y)$.

Note that the Lipschitz continuity of $\nabla g$ and $\nabla^2 g$ in Assumption 1 is implied by the Lipschitz continuity of $h$ and $\nabla h$ in the above assumption.

Assumption 11 is the counterpart of Assumption 10, made for the stationary distribution $\mu_\theta(a|s)$. Suppose Assumptions 10 and 12 hold. From Assumptions 10 and 11, $\mu_\theta(s)$, $\pi_\theta(a|s)$, $\nabla_\theta \mu_\theta(s)$, and $\nabla_\theta \pi_\theta(a|s)$ are Lipschitz continuous and bounded.
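The Young's-inequality step (f) used for (49) can be checked by completing a square. The sketch below assumes the inequality reads $ab \le 2\gamma_k a^2 + b^2/(8\gamma_k)$ with $\gamma_k > 0$. The second term is inferred from the $8\gamma_k$ factor in (49) and is not spelled out in full in the excerpt, and $a$, $b$ stand for generic real scalars.

```latex
\begin{align*}
0 &\le \Big(\sqrt{2\gamma_k}\,a - \frac{b}{\sqrt{8\gamma_k}}\Big)^2 \\
  &= 2\gamma_k a^2 - 2\sqrt{\frac{2\gamma_k}{8\gamma_k}}\,ab + \frac{b^2}{8\gamma_k} \\
  &= 2\gamma_k a^2 - ab + \frac{b^2}{8\gamma_k},
\end{align*}
% Rearranging the last line gives the claimed bound:
\[
ab \;\le\; 2\gamma_k a^2 + \frac{b^2}{8\gamma_k}.
\]
```

Under this reading, the coefficient $1/(8\gamma_k)$ on $b^2$ matches the $8\gamma_k$ in the denominator of the term in (49).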
Feb-11-2026, 08:16:42 GMT