Near-Optimal MNL Bandits Under Risk Criteria
Xi, Guangyu; Tao, Chao; Zhou, Yuan
We study MNL bandits, a variant of the traditional multi-armed bandit problem, under risk criteria. Compared with ordinary expected revenue, risk criteria are more general objectives that are widely used in industry and business. We design algorithms for a broad class of risk criteria, including but not limited to the well-known conditional value-at-risk, Sharpe ratio, and entropy risk, and prove that they achieve near-optimal regret. As a complement, we also conduct experiments with both synthetic and real data to show the empirical performance of our proposed algorithms.
Sep-25-2020
- Country:
- North America > United States
- Illinois (0.04)
- Indiana (0.04)
- Maryland > Prince George's County
- College Park (0.04)
- Genre:
- Research Report (0.64)
- Technology: