Differentially Private Contextual Linear Bandits

Roshan Shariff, Or Sheffet

Neural Information Processing Systems 

We study the contextual linear bandit problem, a version of the standard stochastic multi-armed bandit (MAB) problem where a learner sequentially selects actions to maximize a reward that also depends on a user-provided per-round context. Although the context is chosen arbitrarily or adversarially, the reward is assumed to be a stochastic function of a feature vector that encodes the context and the selected action. Our goal is to devise private learners for the contextual linear bandit problem.
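
The interaction described above can be made concrete with a small simulation. The following is only an illustrative sketch, not the paper's algorithm, and it includes no privacy mechanism. It assumes that the expected reward is the inner product of the feature vector with a hidden parameter `theta`, that the noise is Gaussian, and that contexts are drawn at random (the abstract allows them to be arbitrary or adversarial). The learner here simply picks actions uniformly at random, and all names (`play`, `theta`, `features`) are hypothetical.

```python
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def play(rounds=200, num_actions=4, dim=3, noise=0.1, seed=0):
    """Simulate a contextual linear bandit against a uniformly random learner.

    Returns (total_reward, total_pseudo_regret).
    """
    rng = random.Random(seed)
    # Assumption: a fixed parameter vector, unknown to the learner.
    theta = [rng.uniform(-1, 1) for _ in range(dim)]
    total_reward = 0.0
    total_regret = 0.0
    for _ in range(rounds):
        # The per-round context is encoded as one feature vector per action.
        features = [[rng.uniform(-1, 1) for _ in range(dim)]
                    for _ in range(num_actions)]
        means = [dot(theta, x) for x in features]
        # Placeholder learner: choose an action uniformly at random.
        action = rng.randrange(num_actions)
        # Assumption: reward = linear mean + zero-mean Gaussian noise.
        reward = means[action] + rng.gauss(0.0, noise)
        total_reward += reward
        # Pseudo-regret: gap to the best action for this round's context.
        total_regret += max(means) - means[action]
    return total_reward, total_regret
```

Calling `play(rounds=50, seed=1)` returns the cumulative reward and pseudo-regret of the random learner. A real learner would replace the random choice with a rule that uses past features and rewards, and a private learner would additionally limit how much its choices reveal about any single round's context and reward.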
