Linear Contextual Bandits with Knapsacks
Shipra Agrawal, Nikhil Devanur
Neural Information Processing Systems
In each round, the outcome of pulling an arm is a reward as well as a vector of resource consumptions. The expected values of these outcomes depend linearly on the context of that arm.
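As a rough illustration of the model described above, the following is a minimal sketch in which both the expected reward and the expected consumption vector are linear functions of the chosen arm's context. The names `mu`, `W`, `expected_outcome`, `simulate`, and `choose_arm` are hypothetical, introduced only for this example. The uniform noise and the rule of stopping once a pull would exceed the remaining budget are simplifying assumptions, not details taken from the paper.

```python
import random


def expected_outcome(context, mu, W):
    """Expected reward and consumption vector, both linear in the context.

    context: list of m floats (context of the pulled arm)
    mu:      list of m floats (hypothetical reward parameter)
    W:       m rows of d floats (hypothetical consumption parameters)
    """
    reward = sum(x * w for x, w in zip(context, mu))
    d = len(W[0])
    consumption = [
        sum(context[i] * W[i][j] for i in range(len(context)))
        for j in range(d)
    ]
    return reward, consumption


def simulate(contexts_per_round, choose_arm, mu, W, budget, noise=0.0, seed=0):
    """Play rounds until the next pull would exceed some resource budget.

    contexts_per_round: one list of arm contexts per round
    choose_arm:         function mapping a round's arm contexts to an arm index
    budget:             same budget for each of the d resources (assumption)
    """
    rng = random.Random(seed)
    remaining = [float(budget)] * len(W[0])
    total_reward = 0.0
    for contexts in contexts_per_round:
        arm = choose_arm(contexts)
        exp_r, exp_v = expected_outcome(contexts[arm], mu, W)
        # Observed outcomes are the linear expectations plus bounded noise.
        reward = exp_r + rng.uniform(-noise, noise)
        consumption = [v + rng.uniform(-noise, noise) for v in exp_v]
        # Assumed stopping rule: halt when a resource would be overdrawn.
        if any(c > r for c, r in zip(consumption, remaining)):
            break
        remaining = [r - c for r, c in zip(remaining, consumption)]
        total_reward += reward
    return total_reward, remaining
```

For example, with two arms whose contexts are `[1.0, 0.0]` and `[0.0, 1.0]`, a policy that always pulls the first arm earns reward from `mu[0]` each round and draws down the budget according to the first row of `W`. A learning algorithm would have to estimate `mu` and `W` from observed outcomes rather than read them directly as this sketch does.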
Nov-21-2025, 10:48:12 GMT