Multi-armed Bandit Requiring Monotone Arm Sequences
Ningyuan Chen
Rotman School of Management, University of Toronto
105 St George St, Toronto, ON, Canada
ningyuan.chen@utoronto.ca
Neural Information Processing Systems
In online learning problems such as the multi-armed bandit (MAB), the decision maker chooses from a set of actions (arms) with unknown rewards.
Nov-14-2025, 22:33:46 GMT
- Country:
- Europe > United Kingdom
- England > Cambridgeshire > Cambridge (0.04)
- North America > Canada
- Genre:
- Research Report > Experimental Study (1.00)
- Industry:
- Education (0.87)
- Health & Medicine > Pharmaceuticals & Biotechnology (0.97)
- Technology: