10 Supplementary Material for the paper LeadCache Regret Optimal Caching in Networks by and

Apr-25-2026, 03:10:06 GMT–Neural Information Processing Systems

Following Cohen and Hazan [2015] we derive a general expression for the regret upper bound applicable to any linear reward function under an anytime FTPL policy. This is accomplished in the following steps. First, we extend the argument of Cohen and Hazan [2015] to the anytime setting. Then, we specialize this bound to our problem setting. Recall the notations used in the paper - the aggregate file-request sequence from all users is denoted by {xt}t 1 and the virtual cache configuration sequence is denoted by {zt}t 1. Define the cumulative requests up to time tas: Xt = Furthermore, since the max function 14 is convex, we may interchange the expectation and gradient to obtain Φηt(Xt) =E(zt) [Bertsekas, 1973, Proposition 2.2]. Plugging in the expression of the inner product from Eqn. (25) in expression (26), we obtain: Bounding the term (a): Next, to upper bound the expected regret, we control term (a) in inequality (28).

artificial intelligence, eqn, inequality, (16 more...)

Neural Information Processing Systems

Apr-25-2026, 03:10:06 GMT

Conferences PDF

Add feedback

Technology:
- Information Technology > Artificial Intelligence (0.69)

Duplicate Docs Excel Report

Title
10 Regret Optimal

Similar Docs Excel Report more

Title	Similarity	Source
None found