Nonstochastic Multiarmed Bandits with Unrestricted Delays

Neural Information Processing Systems 

We first prove that "delayed" Exp3 achieves the $O\left(\sqrt{(KT + D)\ln K}\right)$ regret bound conjectured by Cesa-Bianchi et al. [2019] in the case of variable, but bounded delays. Here, $K$ is the number of actions and $D$ is the total delay over $T$ rounds.
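To make the setting concrete, the following is a minimal Python sketch of an Exp3-style learner that updates only on feedback that has already arrived. It is an illustration under stated assumptions, not the paper's exact algorithm or tuning: the function name `delayed_exp3`, the representation of losses and delays as lists, the choice to discard feedback that would arrive after round T, and the learning rate `eta = sqrt(ln K / (KT + D))` (chosen only to match the form of the bound, with constants ignored) are all assumptions made here.

```python
import math
import random


def delayed_exp3(losses, delays, eta, seed=0):
    """Exp3 with delayed bandit feedback (illustrative sketch).

    losses[t][a] is the loss in [0, 1] of action a at round t.
    delays[t] is a non-negative integer: the loss observed for the action
    played at round t becomes available at the end of round t + delays[t].
    Returns the list of played actions and the total loss incurred.
    """
    rng = random.Random(seed)
    T = len(losses)
    K = len(losses[0])
    cum_est = [0.0] * K   # cumulative importance-weighted loss estimates
    pending = {}          # arrival round -> list of (action, loss, probability)
    actions = []
    total_loss = 0.0

    for t in range(T):
        # Exponential weights over the estimates received so far.
        # Subtracting the minimum keeps the exponentials numerically stable.
        m = min(cum_est)
        weights = [math.exp(-eta * (c - m)) for c in cum_est]
        z = sum(weights)
        probs = [w / z for w in weights]

        a = rng.choices(range(K), weights=probs)[0]
        actions.append(a)
        total_loss += losses[t][a]

        # Only the played action's loss is observed, and only after the delay.
        pending.setdefault(t + delays[t], []).append((a, losses[t][a], probs[a]))

        # Process feedback arriving at the end of round t, using the
        # probability with which the action was originally sampled.
        for b, loss, p in pending.pop(t, []):
            cum_est[b] += loss / p

    return actions, total_loss


# Hypothetical toy instance: action 0 always has loss 0, the others loss 1,
# and every round's feedback is delayed by 5 rounds.
T, K, d = 2000, 3, 5
toy_losses = [[0.0] + [1.0] * (K - 1) for _ in range(T)]
toy_delays = [d] * T
D = sum(toy_delays)  # total delay over T rounds
eta = math.sqrt(math.log(K) / (K * T + D))  # assumed tuning, constants ignored
actions, total_loss = delayed_exp3(toy_losses, toy_delays, eta)
```

On this toy instance the best action has zero loss, so `total_loss` is the learner's regret, and it can be compared against `math.sqrt((K * T + D) * math.log(K))` to get a feel for the scale of the bound.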
