Best-Arm Identification in Linear Bandits
Neural Information Processing Systems
We characterize the complexity of best-arm identification in linear bandits and introduce sample allocation strategies that pull arms to identify the best arm with fixed confidence while minimizing the sample budget. In particular, we show the importance of exploiting the global linear structure to improve the reward estimates of near-optimal arms. We analyze the proposed strategies and compare their empirical performance. Finally, as a by-product of our analysis, we point out the connection to the G-optimality criterion used in optimal experimental design.
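To make the G-optimality criterion mentioned in the abstract concrete, here is a minimal sketch. It is not one of the paper's allocation strategies. It assumes a finite arm set given as the rows of a matrix X that spans R^d, and looks for a design lam on the simplex that minimizes max_i x_i^T A(lam)^{-1} x_i, where A(lam) = sum_i lam_i x_i x_i^T. It uses a standard Frank-Wolfe (Fedorov-Wynn style) update. The names `leverage` and `g_optimal_design` are hypothetical and chosen for illustration only.

```python
import numpy as np

def leverage(X, lam):
    """Return x_i^T A(lam)^{-1} x_i for every arm x_i (rows of X)."""
    A = X.T @ (lam[:, None] * X)  # A(lam) = sum_i lam_i x_i x_i^T
    return np.einsum("ij,jk,ik->i", X, np.linalg.inv(A), X)

def g_optimal_design(X, n_iters=2000, tol=1e-6):
    """Approximate G-optimal design via Frank-Wolfe steps (illustrative sketch).

    Assumes the rows of X span R^d so that A(lam) is invertible
    for the uniform starting design.
    """
    K, d = X.shape
    lam = np.full(K, 1.0 / K)  # start from the uniform design
    for _ in range(n_iters):
        g = leverage(X, lam)
        i = int(np.argmax(g))  # arm with the largest prediction variance
        if g[i] <= d + tol:    # Kiefer-Wolfowitz: the optimal value is d
            break
        step = (g[i] / d - 1.0) / (g[i] - 1.0)  # closed-form line-search step
        lam *= 1.0 - step
        lam[i] += step         # move mass toward the worst-covered arm
    return lam, float(leverage(X, lam).max())

# Example: two canonical arms plus a third arm between them.
X = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])
lam, value = g_optimal_design(X)
print(lam, value)
```

By the Kiefer-Wolfowitz equivalence theorem, the minimum of this criterion equals the dimension d, which is why the loop stops once the largest value is within `tol` of d. In the example, the design is expected to concentrate on the two canonical arms, since the third arm is already well covered by them.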
Mar-13-2024, 14:01:36 GMT
- Country:
- Europe > France > Hauts-de-France > Pas-de-Calais (0.04)
- Genre:
- Research Report (0.67)
- Technology: