Multi-armed Bandit Requiring Monotone Arm Sequences

Ningyuan Chen
Rotman School of Management, University of Toronto
105 St George St, Toronto, ON, Canada
ningyuan.chen@utoronto.ca

Neural Information Processing Systems 

In online learning problems such as the multi-armed bandit (MAB) problem, the decision maker chooses from a set of actions (arms) with unknown rewards.
