607bc9ebe4abfcd65181bfbef6252830-AuthorFeedback.pdf

Neural Information Processing Systems 

In [8], the anti-concentration condition is a central assumption for the analysis of the regret bound under the sub-Weibull perturbation. However, heavy-tailed perturbations, including GEV and Gamma, do not satisfy the anti-concentration condition. Hence, we propose a new framework (Assumption 2 in the main paper), which is a sufficient condition to ensure bounded regret and generalizes the anti-concentration condition. To the best of our knowledge, this is the first result on heavy-tailed perturbations in the stochastic MAB.

We remark that [6] analyzes the upper bound of the simple regret, which focuses on finding the optimal action after T rounds, so it does not tell how much reward will be lost during the exploration.

This trick itself appears in [1, 8].
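For reference, the distinction drawn above between simple regret and the regret accumulated during exploration can be written out as follows. This is a sketch using the standard textbook definitions, not notation taken from the paper or from [6]: the symbols \mu_a (mean reward of arm a), \mu^* (best mean), a_t (arm pulled at round t), and \hat{a}_T (arm recommended after T rounds) are assumptions introduced here for illustration.

```latex
% Assumed notation (not from the source):
%   \mu_a       mean reward of arm a,   \mu^* = \max_a \mu_a
%   a_t         arm pulled at round t
%   \hat{a}_T   arm recommended after T rounds of exploration
\begin{align*}
\text{cumulative regret:}\quad
  \mathbb{E}[R_T] &= \mathbb{E}\!\left[\sum_{t=1}^{T} \bigl(\mu^{*} - \mu_{a_t}\bigr)\right], \\
\text{simple regret:}\quad
  \mathbb{E}[r_T] &= \mathbb{E}\!\left[\mu^{*} - \mu_{\hat{a}_T}\right].
\end{align*}
```

Under these definitions the simple regret depends only on the final recommendation \hat{a}_T, so a bound on it places no direct constraint on the per-round losses \mu^* - \mu_{a_t} incurred while exploring.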
