Differentially Private Contextual Linear Bandits
Roshan Shariff, Or Sheffet
We study the contextual linear bandit problem, a version of the standard stochastic multi-armed bandit (MAB) problem in which a learner sequentially selects actions to maximize a reward that also depends on a user-provided per-round context. Although the context is chosen arbitrarily or adversarially, the reward is assumed to be a stochastic function of a feature vector that encodes the context and the selected action. Our goal is to devise private learners for the contextual linear bandit problem.
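To make the interaction protocol concrete, the following is a minimal sketch of one way the setting could be simulated. It is not the authors' algorithm and it is not differentially private. Everything named here is an assumption for illustration: the hidden parameter `theta_star`, the linear-plus-Gaussian-noise reward model, the randomly drawn feature vectors (a stand-in for arbitrarily chosen contexts), and the epsilon-greedy learner with a stochastic-gradient estimate `theta_hat`.

```python
import random


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def run_protocol(num_rounds=2000, dim=3, num_actions=4, noise_sd=0.1, seed=0):
    """Simulate a contextual linear bandit with a simple non-private learner.

    Returns (theta_star, theta_hat, pseudo_regret). All modeling choices are
    illustrative assumptions, not taken from the paper.
    """
    rng = random.Random(seed)
    # Hidden parameter: expected reward is assumed linear in the features.
    theta_star = [rng.uniform(-1, 1) for _ in range(dim)]
    theta_hat = [0.0] * dim   # learner's running estimate
    step = 0.05               # SGD step size (hypothetical choice)
    explore = 0.1             # epsilon-greedy exploration rate (hypothetical)
    earned = 0.0              # sum of expected rewards of chosen actions
    best = 0.0                # sum of expected rewards of best actions

    for _ in range(num_rounds):
        # The context arrives as one feature vector per action. Here the
        # vectors are random; in the setting above they may be adversarial.
        features = [[rng.uniform(-1, 1) for _ in range(dim)]
                    for _ in range(num_actions)]

        # Learner picks an action: explore at random, else act greedily.
        if rng.random() < explore:
            action = rng.randrange(num_actions)
        else:
            action = max(range(num_actions),
                         key=lambda i: dot(theta_hat, features[i]))
        x = features[action]

        # Environment returns a noisy reward for the chosen action only.
        reward = dot(theta_star, x) + rng.gauss(0.0, noise_sd)

        # One stochastic-gradient step on the squared prediction error.
        err = reward - dot(theta_hat, x)
        theta_hat = [w + step * err * xi for w, xi in zip(theta_hat, x)]

        earned += dot(theta_star, x)
        best += max(dot(theta_star, f) for f in features)

    return theta_star, theta_hat, best - earned


if __name__ == "__main__":
    theta_star, theta_hat, regret = run_protocol()
    print("true parameter:     ", theta_star)
    print("estimated parameter:", theta_hat)
    print("pseudo-regret:      ", regret)
```

In this sketch the learner's estimate is updated directly from each user's context and reward, which is exactly the information a private learner would need to protect.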