From Finite to Countable-Armed Bandits
Neural Information Processing Systems
We consider a stochastic bandit problem with countably many arms that belong to a finite set of types, each characterized by a unique mean reward. In addition, a fixed distribution over types sets the proportion of each type in the population of arms. The decision maker knows neither the type of any arm nor this distribution over types, but does know exactly how many types occur in the population of arms.
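The setup described above can be made concrete with a small simulation. The following is a minimal sketch of such an environment, not the authors' implementation: the class name `CountableArmedBandit`, its methods, the Gaussian reward noise, and the example means and probabilities are all assumptions made for illustration. The abstract itself only specifies a finite set of types with unique mean rewards, a fixed distribution over types, and a decision maker who knows the number of types.

```python
import random


class CountableArmedBandit:
    """Hypothetical environment for the problem sketched in the abstract.

    Arms are created on demand, so the supply of arms is unbounded.
    Each new arm receives a hidden type drawn from a fixed distribution.
    Gaussian reward noise is an assumption made for this sketch.
    """

    def __init__(self, type_means, type_probs, noise_sd=1.0, seed=0):
        if len(type_means) != len(type_probs):
            raise ValueError("need one probability per type")
        if len(set(type_means)) != len(type_means):
            raise ValueError("each type must have a unique mean reward")
        if abs(sum(type_probs) - 1.0) > 1e-9:
            raise ValueError("type probabilities must sum to 1")
        self._type_means = list(type_means)
        self._type_probs = list(type_probs)
        self._noise_sd = noise_sd
        self._rng = random.Random(seed)
        self._arm_types = []  # hidden from the decision maker

    @property
    def num_types(self):
        # The only population-level quantity the decision maker knows.
        return len(self._type_means)

    def new_arm(self):
        """Draw a fresh arm from the population and return its index."""
        type_ids = range(self.num_types)
        arm_type = self._rng.choices(type_ids, weights=self._type_probs)[0]
        self._arm_types.append(arm_type)
        return len(self._arm_types) - 1

    def pull(self, arm):
        """Return a noisy reward centred on the mean of the arm's hidden type."""
        mean = self._type_means[self._arm_types[arm]]
        return self._rng.gauss(mean, self._noise_sd)


if __name__ == "__main__":
    # Illustrative values only: two types with means 0.2 and 0.8.
    env = CountableArmedBandit([0.2, 0.8], [0.7, 0.3])
    arm = env.new_arm()
    print(env.num_types, arm, env.pull(arm))
```

A policy interacting with this sketch would see only `num_types`, the arm indices it has requested, and the rewards it has observed. The type means, the type probabilities, and each arm's type stay hidden, mirroring the information structure in the abstract.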
Nov-14-2025, 02:38:45 GMT