Online Multi-Armed Bandits with Adaptive Inference

Neural Information Processing Systems 

During online decision making in multi-armed bandits, one needs to conduct inference on the true mean reward of each arm based on data collected so far at each step.
