Finite-Time Analysis of Round-Robin Kullback-Leibler Upper Confidence Bounds for Optimal Adaptive Allocation with Multiple Plays and Markovian Rewards
–Neural Information Processing Systems
For our analysis we devise several concentration results for Markov chains, including a maximal inequality for Markov chains, that may be of interest in their own right. As a byproduct of our analysis we also establish asymptotically optimal, finite-time guarantees for the case of multiple plays, and i.i.d.
Feb-8-2026, 12:46:35 GMT
- Country:
  - Europe
    - Hungary > Budapest > Budapest (0.04)
    - United Kingdom > England > Cambridgeshire > Cambridge (0.04)
  - North America
    - Canada > British Columbia
    - United States
      - California
        - Alameda County > Hayward (0.04)
        - Santa Clara County > Palo Alto (0.04)
      - New Jersey > Hudson County > Hoboken (0.04)