Note that the regret ofthe algorithm in [1]satisfiesR(G,T) = O (δ(G)logn)
–Neural Information Processing Systems
The bandit problem with graph feedback, proposed in [Mannor and Shamir, NeurIPS 2011], is modeled by a directed graphG = (V,E) where V is the collection of bandit arms, and once an arm is triggered, all its incident arms are observed.
Neural Information Processing Systems
Feb-11-2026, 06:07:07 GMT
- Country:
- Asia > China (0.04)
- Europe > United Kingdom
- England > Cambridgeshire > Cambridge (0.04)
- Technology: