From Finite to Countable-Armed Bandits

Neural Information Processing Systems 

We consider a stochastic bandit problem with countably many arms that belong to a finite set of types, each characterized by a unique mean reward. In addition, a fixed distribution over types sets the proportion of each type in the population of arms. The decision maker knows neither the type of any arm nor this distribution over types, but knows exactly the total number of types occurring in the population of arms.
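
To make the setup concrete, the following is a minimal simulation sketch of such an environment, not the paper's own code. The class name `CountableArmedBandit`, its methods, and the choice of Bernoulli rewards are assumptions made for illustration; the abstract only specifies that each type has a unique mean reward and that types follow a fixed distribution. The decision maker may read `num_types` but nothing else about the hidden state.

```python
import random


class CountableArmedBandit:
    """Hypothetical environment: an unbounded reservoir of arms, finitely many types."""

    def __init__(self, type_means, type_probs, seed=0):
        if len(type_means) != len(type_probs):
            raise ValueError("need one probability per type")
        if len(set(type_means)) != len(type_means):
            raise ValueError("each type must have a unique mean reward")
        if any(not 0.0 <= m <= 1.0 for m in type_means):
            # Assumption: Bernoulli rewards, so means must lie in [0, 1].
            raise ValueError("Bernoulli means must lie in [0, 1]")
        if abs(sum(type_probs) - 1.0) > 1e-9:
            raise ValueError("type probabilities must sum to 1")
        self._means = list(type_means)   # hidden from the decision maker
        self._probs = list(type_probs)   # hidden from the decision maker
        self._arm_types = []             # hidden type index of each arm drawn so far
        self._rng = random.Random(seed)
        self.num_types = len(type_means)  # the only quantity known to the decision maker

    def new_arm(self):
        """Draw a fresh arm from the population and return its index."""
        type_index = self._rng.choices(
            range(self.num_types), weights=self._probs, k=1
        )[0]
        self._arm_types.append(type_index)
        return len(self._arm_types) - 1

    def pull(self, arm):
        """Return a Bernoulli reward whose mean is that of the arm's hidden type."""
        mean = self._means[self._arm_types[arm]]
        return 1.0 if self._rng.random() < mean else 0.0
```

A policy would interact with this sketch only through `new_arm()`, `pull(arm)`, and `num_types`, which mirrors the information structure described above: arm types and the type distribution stay hidden, while the number of types is known.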
