Best of both worlds: Stochastic & adversarial best-arm identification
Yasin Abbasi-Yadkori, Peter L. Bartlett, Victor Gabillon, Alan Malek, Michal Valko
We study bandit best-arm identification with arbitrary and potentially adversarial rewards. A simple learner that samples arms uniformly at random obtains the optimal rate of error in the adversarial scenario. However, this type of strategy is suboptimal when the rewards are sampled stochastically. We therefore ask: can we design a learner that performs optimally in both the stochastic and the adversarial problems without knowing the nature of the rewards? First, we show that designing such a learner is impossible in general: to be robust to adversarial rewards, a learner can guarantee optimal rates of error only on a subset of the stochastic problems. We then give a lower bound that characterizes the optimal rate in stochastic problems when the strategy is constrained to be robust to adversarial rewards. Finally, we design a simple parameter-free algorithm and show that its probability of error matches the lower bound (up to log factors) in stochastic problems, while it also remains robust to adversarial ones.
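As an illustration of the uniform baseline mentioned in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes a fixed budget, rewards in [0, 1] stored in a precomputed table (so the same code covers stochastic and adversarial sequences), and an importance-weighted estimate of each arm's cumulative gain. The names `uniform_best_arm` and `bernoulli_rewards` are hypothetical, and the parameter-free algorithm proposed in the paper is not reproduced here.

```python
import random


def uniform_best_arm(rewards, seed=0):
    """Recommend an arm after sampling uniformly at random for len(rewards) rounds.

    rewards[t][k] is the reward of arm k at round t. The table is fixed in
    advance, so it may come from a stochastic or an adversarial source.
    (Hypothetical helper for illustration only.)
    """
    rng = random.Random(seed)
    n_arms = len(rewards[0])
    estimates = [0.0] * n_arms
    for round_rewards in rewards:
        arm = rng.randrange(n_arms)  # each arm is pulled with probability 1 / n_arms
        # Dividing the observed reward by the sampling probability (that is,
        # multiplying by n_arms) keeps the cumulative-gain estimate unbiased.
        estimates[arm] += round_rewards[arm] * n_arms
    # Recommend the arm with the largest estimated cumulative gain.
    return max(range(n_arms), key=lambda k: estimates[k])


def bernoulli_rewards(means, horizon, seed=1):
    """Build a stochastic reward table with independent Bernoulli arms (assumed setup)."""
    rng = random.Random(seed)
    return [[1.0 if rng.random() < m else 0.0 for m in means] for _ in range(horizon)]


# Usage: three well-separated arms and a generous budget.
rewards = bernoulli_rewards([0.9, 0.2, 0.1], horizon=3000)
best = uniform_best_arm(rewards)
print("recommended arm:", best)
```

Because the sampling distribution never adapts to the observed rewards, an adversary cannot exploit it; the cost is that pulls are also spent on clearly poor arms, which is the inefficiency in stochastic problems that the abstract refers to.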