Linear Contextual Bandits with Knapsacks

Shipra Agrawal, Nikhil Devanur

Neural Information Processing Systems 

In each round, the outcome of pulling an arm is a reward as well as a vector of resource consumptions. The expected values of these outcomes depend linearly on the context of that arm.
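As a minimal sketch of the linear dependence described above, the snippet below computes the expected reward and the expected resource-consumption vector for a single arm from its context. The names `expected_outcome`, `mu`, and `W`, and the numeric values, are illustrative assumptions and are not taken from the paper.

```python
def expected_outcome(context, mu, W):
    """Expected (reward, consumption vector) for one arm under a linear model.

    context: list of m floats, the arm's context vector
    mu:      list of m floats, hypothetical reward weights (assumed)
    W:       list of d rows of m floats, hypothetical consumption weights (assumed)
    """
    # The expected reward is the inner product of the context and mu.
    reward = sum(x * w for x, w in zip(context, mu))
    # Each resource's expected consumption is the inner product with its row of W.
    consumption = [sum(x * w for x, w in zip(context, row)) for row in W]
    return reward, consumption


# Illustrative values only: 2-dimensional context, 2 resources.
context = [1.0, 2.0]
mu = [0.5, 0.25]
W = [[0.1, 0.0],
     [0.0, 0.2]]

reward, consumption = expected_outcome(context, mu, W)
print(reward, consumption)
```

In the bandit setting these weights are unknown to the learner and would have to be estimated from the outcomes observed in earlier rounds; the sketch only illustrates the assumed linear form.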
