Differentially Private Contextual Linear Bandits
Neural Information Processing Systems
We study the contextual linear bandit problem, a variant of the standard stochastic multi-armed bandit (MAB) problem in which a learner sequentially selects actions to maximize a reward that also depends on a user-provided per-round context. Although the context may be chosen arbitrarily or adversarially, the reward is assumed to be a stochastic function of a feature vector that encodes the context and the selected action. Our goal is to devise private learners for the contextual linear bandit problem.
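To make the problem setup concrete, the following is a minimal sketch of the interaction protocol only. It is not the private learner the paper develops and includes no privacy mechanism. It assumes the usual linear reward model (expected reward equal to the inner product of the feature vector with a hidden parameter) and uses a LinUCB-style learner as a stand-in. The function name `run_linear_bandit`, all of its parameters, the Gaussian noise, and the randomly drawn contexts are illustrative assumptions, not details from the source.

```python
import numpy as np


def run_linear_bandit(n_rounds=500, n_actions=5, dim=3, noise_std=0.1, seed=0):
    """Simulate a NON-private contextual linear bandit (illustrative sketch)."""
    rng = np.random.default_rng(seed)

    # Hidden parameter (assumption: unit norm), unknown to the learner.
    theta_star = rng.normal(size=dim)
    theta_star /= np.linalg.norm(theta_star)

    gram = np.eye(dim)        # regularized Gram matrix: I + sum of x x^T
    response = np.zeros(dim)  # running sum of reward * x
    total_regret = 0.0

    for _ in range(n_rounds):
        # Per-round context: one unit feature vector per action. Drawn at
        # random here for simplicity; the setting allows arbitrary contexts.
        features = rng.normal(size=(n_actions, dim))
        features /= np.linalg.norm(features, axis=1, keepdims=True)

        # Ridge estimate of the parameter plus an optimism bonus per action.
        theta_hat = np.linalg.solve(gram, response)
        gram_inv = np.linalg.inv(gram)
        bonus = np.sqrt(np.einsum("ij,jk,ik->i", features, gram_inv, features))
        chosen = int(np.argmax(features @ theta_hat + bonus))

        # Stochastic reward: linear in the chosen feature vector plus noise.
        x = features[chosen]
        reward = x @ theta_star + noise_std * rng.normal()

        # Update the sufficient statistics with the observed pair.
        gram += np.outer(x, x)
        response += reward * x

        # Regret against the best action for this round's context.
        total_regret += np.max(features @ theta_star) - x @ theta_star

    return theta_hat, theta_star, total_regret
```

The learner's state is just the Gram matrix and the reward-weighted feature sum, both of which accumulate every user's context and reward. A private learner would need to protect such statistics before acting on them; how the paper does so is not shown here.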
Oct-7-2024, 22:11:26 GMT
- Country:
- North America
- Canada > Alberta (0.46)
- United States (1.00)
- Industry:
- Information Technology > Security & Privacy (0.46)
- Technology: