Differentially Private Contextual Linear Bandits

Feb-14-2020, 14:27:44 GMT–Neural Information Processing Systems

We study the contextual linear bandit problem, a version of the standard stochastic multi-armed bandit (MAB) problem in which a learner sequentially selects actions to maximize a reward that also depends on a user-provided per-round context. Though the context is chosen arbitrarily or adversarially, the reward is assumed to be a stochastic function of a feature vector that encodes the context and the selected action. Our goal is to devise private learners for the contextual linear bandit problem. Rather than the standard notion of differential privacy, we adopt the notion of joint differential privacy, where we assume that the action chosen on day t is revealed only to user t and thus needn't be kept private that day, only on following days. We give a general scheme for converting the classic linear-UCB algorithm into a joint differentially private algorithm using the tree-based algorithm.
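
To make the tree-based idea concrete, here is a minimal sketch of the binary (tree-based) mechanism for releasing running sums privately. It is an illustration under stated assumptions, not the paper's construction: it handles scalar values rather than the matrix and vector statistics a linear-UCB learner accumulates, it assumes each value changes by at most 1 when one user's data changes, and it uses Laplace noise, which the abstract does not specify. The names `private_prefix_sums`, `laplace`, and the `noise_free` flag are hypothetical.

```python
import random


def laplace(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    if scale == 0:
        return 0.0
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def private_prefix_sums(values, epsilon, seed=0, noise_free=False):
    """Return noisy running sums of `values` via a dyadic tree of partial sums.

    Assumption: changing one user's data changes a single value by at most 1,
    so it affects at most one tree node per level.
    """
    rng = random.Random(seed)
    T = len(values)
    levels = max(1, T.bit_length())  # node sizes 1, 2, 4, ... used below
    scale = 0.0 if noise_free else levels / epsilon

    # Exact prefix sums, used only to compute each node's true value.
    exact = [0.0]
    for v in values:
        exact.append(exact[-1] + v)

    noisy_node = {}  # (start, length) -> noisy partial sum, drawn once

    def node(start, length):
        key = (start, length)
        if key not in noisy_node:
            true_sum = exact[start + length] - exact[start]
            noisy_node[key] = true_sum + laplace(scale, rng)
        return noisy_node[key]

    out = []
    for t in range(1, T + 1):
        # Split [0, t) into aligned dyadic blocks, one per set bit of t.
        total, start = 0.0, 0
        for bit in reversed(range(t.bit_length())):
            length = 1 << bit
            if t & length:
                total += node(start, length)
                start += length
        out.append(total)
    return out


if __name__ == "__main__":
    rewards = [1, 0, 1, 1, 0, 1, 1]
    print(private_prefix_sums(rewards, epsilon=1.0))
```

Each running sum combines at most about log2(T) noisy nodes, so the error grows only polylogarithmically in the number of rounds instead of linearly, which is the reason a tree is used instead of adding fresh noise to every prefix.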

algorithm, artificial intelligence, big data

Technology:
- Information Technology
  - Artificial Intelligence > Machine Learning (0.86)
  - Data Science > Data Mining
    - Big Data (0.87)