7a62d9a4c03377d1175b8859b4cc16d4-Supplemental-Conference.pdf

Neural Information Processing Systems 

The original bandits with knapsacks problem assumes than when this happens, the process of learning and gaining rewards ceases. The key distinction between that model and ours is that we instead assume the learner is allowed remain idle until the supply of every resource becomes positive again, at which point the learning process recommences.

Similar Docs  Excel Report  more

TitleSimilaritySource
None found