Thompson Sampling on Symmetric $\alpha$-Stable Bandits

Abhimanyu Dubey, Alex Pentland

arXiv.org Machine Learning 

Thompson Sampling provides an efficient technique to introduce prior knowledge in the multi-armed bandit problem, along with providing remarkable empirical performance. In this paper, we revisit the Thompson Sampling algorithm under rewards drawn from symmetric α-stable distributions, which are a class of heavy-tailed probability distributions utilized in finance and economics, in problems such as modeling stock prices and human behavior. We present an efficient framework for posterior inference, which leads to two algorithms for Thompson Sampling in this setting.

Rigorous empirical evidence in favor of TS demonstrated by Chapelle and Li [2011] sparked new interest in the theoretical analysis of the algorithm, and the seminal work of Agrawal and Goyal [2012, 2013] and Russo and Van Roy [2014] demonstrated the optimality of TS when rewards are bounded in [0, 1] or are Gaussian. These results were extended in the work of Korda et al. [2013] to more general, exponential family reward distributions. These empirical studies, along with the theoretical guarantees, have established TS as a powerful algorithm for the MAB problem. However, when designing decision-making algorithms for complex systems, we see that interactions in such systems often lead to heavy-tailed and power law distributions, such as
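The setting above can be sketched in code. The sampler below uses the standard Chambers-Mallows-Stuck method to draw symmetric α-stable rewards; the bandit loop is a deliberately simplified Thompson Sampling baseline with a heuristic Gaussian posterior over each arm's mean, not the paper's posterior-inference framework. The function names, the two-arm example, and the `1/sqrt(n)` posterior width are all illustrative assumptions.

```python
import math
import random


def sas_sample(alpha, rng):
    """One symmetric alpha-stable draw via Chambers-Mallows-Stuck.

    alpha in (0, 2]; alpha=1 is Cauchy, alpha=2 is Gaussian (scale sqrt(2)).
    """
    u = rng.uniform(-math.pi / 2, math.pi / 2)  # uniform angle
    w = rng.expovariate(1.0)                    # unit exponential
    if abs(alpha - 1.0) < 1e-12:
        return math.tan(u)                      # Cauchy special case
    return (math.sin(alpha * u) / math.cos(u) ** (1.0 / alpha)
            * (math.cos((1.0 - alpha) * u) / w) ** ((1.0 - alpha) / alpha))


def thompson_sampling(means, alpha, horizon, rng):
    """Gaussian-posterior TS on arms with additive sAS noise (toy baseline).

    Returns the pull count for each arm after `horizon` rounds.
    """
    k = len(means)
    counts = [0] * k
    sums = [0.0] * k
    for a in range(k):  # initialize: pull each arm once
        counts[a] += 1
        sums[a] += means[a] + sas_sample(alpha, rng)
    for _ in range(horizon - k):
        # Sample a plausible mean per arm from N(mean_hat, 1/n) and pick the max.
        idx = max(range(k),
                  key=lambda a: rng.gauss(sums[a] / counts[a],
                                          1.0 / math.sqrt(counts[a])))
        counts[idx] += 1
        sums[idx] += means[idx] + sas_sample(alpha, rng)
    return counts
```

Because α-stable rewards with α < 2 have infinite variance, the empirical-mean posterior used here is fragile under outliers, which is precisely the failure mode that motivates the dedicated posterior-inference framework the paper develops.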
