Thompson Sampling on Symmetric $\alpha$-Stable Bandits
Abhimanyu Dubey, Alex Pentland
Thompson Sampling provides an efficient technique to introduce prior knowledge in the multi-armed bandit problem, along with providing remarkable empirical performance. In this paper, we revisit the Thompson Sampling algorithm under rewards drawn from symmetric α-stable distributions, which are a class of heavy-tailed probability distributions utilized in finance and economics, in problems such as modeling stock prices and human behavior. We present an efficient framework for posterior inference, which leads to two algorithms for Thompson Sampling in this setting.

Rigorous empirical evidence in favor of TS demonstrated by Chapelle and Li [2011] sparked new interest in the theoretical analysis of the algorithm, and the seminal work of Agrawal and Goyal [2012, 2013] and Russo and Van Roy [2014] demonstrated the optimality of TS when rewards are bounded in [0, 1] or are Gaussian. These results were extended in the work of Korda et al. [2013] to more general, exponential family reward distributions. The empirical studies, along with theoretical guarantees, have established TS as a powerful algorithm for the MAB problem. However, when designing decision-making algorithms for complex systems, we see that interactions in such systems often lead to heavy-tailed and power law distributions, such as
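To make the setting concrete, the sketch below simulates a bandit whose reward noise is symmetric α-stable, generated with the standard Chambers-Mallows-Stuck method, and runs a naive Thompson Sampling baseline with Gaussian posteriors on the arm means. This baseline is an assumption for illustration only: it is not the paper's posterior-inference framework, and with α < 2 (heavy tails, possibly infinite variance) its sample-mean updates are exactly what the paper's α-stable posteriors are designed to improve on. All function names here are hypothetical.

```python
import numpy as np

def sample_symmetric_alpha_stable(alpha, size, rng):
    """Chambers-Mallows-Stuck sampler for symmetric alpha-stable variates
    (skew beta = 0, unit scale). Valid for 0 < alpha <= 2; alpha = 2 is
    Gaussian up to scale, alpha = 1 is standard Cauchy."""
    u = rng.uniform(-np.pi / 2, np.pi / 2, size)   # U ~ Uniform(-pi/2, pi/2)
    w = rng.exponential(1.0, size)                 # W ~ Exp(1)
    if np.isclose(alpha, 1.0):
        return np.tan(u)                           # Cauchy special case
    return (np.sin(alpha * u) / np.cos(u) ** (1.0 / alpha)
            * (np.cos((1.0 - alpha) * u) / w) ** ((1.0 - alpha) / alpha))

def gaussian_ts(locs, alpha, horizon, rng):
    """Naive Thompson Sampling baseline: Gaussian posterior over each arm's
    mean, updated by the running sample mean of observed rewards. Rewards
    are arm location + symmetric alpha-stable noise."""
    k = len(locs)
    counts = np.zeros(k, dtype=int)
    means = np.zeros(k)
    for _ in range(horizon):
        # One posterior draw per arm; posterior variance shrinks with pulls.
        theta = rng.normal(means, 1.0 / np.sqrt(counts + 1))
        arm = int(np.argmax(theta))
        reward = locs[arm] + sample_symmetric_alpha_stable(alpha, 1, rng)[0]
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]  # running mean
    return counts
```

Running `gaussian_ts([0.0, 0.5, 1.0], alpha=1.8, horizon=2000, rng)` illustrates how heavy-tailed reward noise perturbs the empirical means the Gaussian posterior relies on, which motivates the α-stable posterior inference developed in the paper.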
Jul-8-2019