Optimal Algorithms for Stochastic Multi-Armed Bandits with Heavy Tailed Rewards
Lee, Kyungjae, Yang, Hongjun, Lim, Sungbin, Oh, Songhwai
In this paper, we consider stochastic multi-armed bandits (MABs) with heavy-tailed rewards, whose $p$-th moment is bounded by a constant $\nu_{p}$ for $1
Oct-24-2020