
 gaussian complexity




10 Supplementary Material for the paper "LeadCache: Regret-Optimal Caching in Networks"

Neural Information Processing Systems

Following Cohen and Hazan [2015], we derive a general expression for the regret upper bound applicable to any linear reward function under an anytime FTPL policy. This is accomplished in two steps. First, we extend the argument of Cohen and Hazan [2015] to the anytime setting. Then, we specialize this bound to our problem setting. Recall the notation used in the paper: the aggregate file-request sequence from all users is denoted by $\{x_t\}_{t \geq 1}$ and the virtual cache configuration sequence is denoted by $\{z_t\}_{t \geq 1}$. Define the cumulative requests up to time $t$ as: $X_t =$ [...] Furthermore, since the max function is convex, we may interchange the expectation and gradient to obtain $\nabla \Phi_{\eta_t}(X_t) = \mathbb{E}(z_t)$ [Bertsekas, 1973, Proposition 2.2]. Plugging the expression for the inner product from Eqn. (25) into expression (26), we obtain: [...] Bounding the term (a): Next, to upper bound the expected regret, we control term (a) in inequality (28).
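The excerpt above states that the gradient of the smoothed potential equals the expected cache configuration. The following is a minimal numerical sketch of that identity, not the authors' code. It assumes a Gaussian perturbation, a potential of the form $\Phi_\eta(X) = \mathbb{E}_\gamma[\max_z \langle z, X + \eta\gamma \rangle]$, and a single cache that holds `capacity` files out of `X.size`; the function names, the perturbation law, and the capacity constraint are all illustrative assumptions that the excerpt does not specify.

```python
import numpy as np

def ftpl_action(X, eta, capacity, rng):
    """Sample one FTPL action: cache the `capacity` files with the
    largest perturbed cumulative counts (assumed Gaussian perturbation)."""
    perturbed = X + eta * rng.standard_normal(X.shape)
    z = np.zeros(X.shape, dtype=float)
    # Indices of the `capacity` largest perturbed counts.
    z[np.argpartition(perturbed, -capacity)[-capacity:]] = 1.0
    return z

def expected_action(X, eta, capacity, n_samples=20000, seed=0):
    """Monte Carlo estimate of E[z_t], the expected cache configuration."""
    rng = np.random.default_rng(seed)
    samples = [ftpl_action(X, eta, capacity, rng) for _ in range(n_samples)]
    return np.mean(samples, axis=0)

def smoothed_potential(X, eta, capacity, n_samples=20000, seed=0):
    """Monte Carlo estimate of Phi_eta(X) = E[max_z <z, X + eta * gamma>].
    For a capacity constraint, the max is the sum of the top entries."""
    rng = np.random.default_rng(seed)
    perturbed = X + eta * rng.standard_normal((n_samples, X.size))
    top = np.partition(perturbed, -capacity, axis=1)[:, -capacity:]
    return top.sum(axis=1).mean()

X = np.array([5.0, 3.0, 1.0, 0.0])   # hypothetical cumulative request counts
eta, capacity = 1.0, 2
grad_estimate = expected_action(X, eta, capacity)

# Finite-difference derivative of the potential along the first coordinate,
# using the same seed on both sides so the sampling noise largely cancels.
h = 1e-3
e0 = np.array([1.0, 0.0, 0.0, 0.0])
fd_estimate = (smoothed_potential(X + h * e0, eta, capacity)
               - smoothed_potential(X - h * e0, eta, capacity)) / (2 * h)
```

Under these assumptions, `fd_estimate` should agree with `grad_estimate[0]` up to Monte Carlo error, and the entries of `grad_estimate` sum to `capacity` because every sampled configuration caches exactly that many files.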





A Proof of Theorem 1

Neural Information Processing Systems

Theorem 6 is stated in terms of Gaussian complexity; a full proof is given by Ben-David (2014). $M(\alpha)$ is the linear class following the depth-$K$ neural network. The second term relies on the Lipschitz constant of the DNN, which we bound with the following lemma. Similar results are given by Scaman and Virmaux (2018) and Fazlyab et al. (2019).
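The lemma itself is not reproduced in this excerpt. As an illustration of the kind of bound involved, the sketch below computes a standard upper bound on the Lipschitz constant of a feed-forward ReLU network: the product of the spectral norms of its weight matrices. This is a well-known bound of the type discussed by Scaman and Virmaux (2018) and is not necessarily the one the authors prove. The bias-free architecture, layer sizes, and function names are assumptions made for the example.

```python
import numpy as np

def relu_network(weights, x):
    """Bias-free feed-forward network with ReLU on all hidden layers."""
    h = x
    for W in weights[:-1]:
        h = np.maximum(W @ h, 0.0)
    return weights[-1] @ h

def lipschitz_upper_bound(weights):
    """Product of spectral norms: an upper bound on the l2 Lipschitz
    constant, valid because ReLU is 1-Lipschitz."""
    return float(np.prod([np.linalg.norm(W, 2) for W in weights]))

rng = np.random.default_rng(0)
# Hypothetical depth-3 network mapping R^4 -> R^3.
weights = [rng.standard_normal((8, 4)),
           rng.standard_normal((8, 8)),
           rng.standard_normal((3, 8))]
bound = lipschitz_upper_bound(weights)

# Empirical check: the output change never exceeds bound * input change.
ratios = []
for _ in range(200):
    x, y = rng.standard_normal(4), rng.standard_normal(4)
    num = np.linalg.norm(relu_network(weights, x) - relu_network(weights, y))
    ratios.append(num / np.linalg.norm(x - y))
max_ratio = max(ratios)
```

The bound is typically loose: `max_ratio` observed on random input pairs is usually well below `bound`. That gap is why tighter estimates such as those of Fazlyab et al. (2019) are of interest.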