Local Differential Privacy for Regret Minimization in Reinforcement Learning

Oct-10-2024, 12:35:37 GMT–Neural Information Processing Systems

Reinforcement learning (RL) algorithms are widely used in domains where it is desirable to provide a personalized service. In these domains, user data commonly contains sensitive information that needs to be protected from third parties. Motivated by this, we study privacy in the context of finite-horizon Markov Decision Processes (MDPs) by requiring information to be obfuscated on the user side. We formulate this notion of privacy for RL by leveraging the local differential privacy (LDP) framework. We establish a lower bound for regret minimization in finite-horizon MDPs with LDP guarantees, which shows that guaranteeing privacy has a multiplicative effect on the regret.
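To make "obfuscated on the user side" concrete, here is a minimal sketch of one standard LDP primitive, the Laplace mechanism, applied to a single reward before it leaves the user's device. The abstract does not say which mechanism the authors use, so this is an illustration under stated assumptions and not their method: rewards are assumed to lie in [0, 1], and the name `privatize_reward` is hypothetical.

```python
import random


def privatize_reward(reward, epsilon, rng):
    """Release a reward in [0, 1] under epsilon-LDP using Laplace noise.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    # Clip so that any two possible inputs differ by at most 1 (sensitivity 1).
    clipped = min(max(float(reward), 0.0), 1.0)
    # The difference of two independent Exponential(rate=epsilon) draws is
    # Laplace-distributed with location 0 and scale 1 / epsilon.
    noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
    return clipped + noise


rng = random.Random(0)
true_reward = 0.7
# The learner only ever sees the noisy values, never true_reward itself.
noisy = [privatize_reward(true_reward, epsilon=1.0, rng=rng) for _ in range(20000)]
estimate = sum(noisy) / len(noisy)
print(f"average of privatized rewards: {estimate:.3f}")
```

Because the noise has zero mean, averaging many privatized rewards still recovers the true mean, but the per-sample noise scale grows as 1/epsilon. That gives some intuition for why a smaller privacy budget slows learning, which is the kind of cost the regret lower bound quantifies.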

local differential privacy, regret minimization, reinforcement learning

Industry:
- Information Technology > Security & Privacy (0.64)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Reinforcement Learning (1.00)
  - Learning Graphical Models > Undirected Networks
    - Markov Models (0.70)