Privacy-Constrained Policies via Mutual Information Regularized Policy Gradients

Chris Cundy, Rishi Desai, Stefano Ermon

Apr-16-2024 – arXiv.org Artificial Intelligence

As reinforcement learning techniques are increasingly applied to real-world decision problems, attention has turned to how these algorithms use potentially sensitive information. We consider the task of training a policy that maximizes reward while minimizing disclosure of certain sensitive state variables through the actions. We give examples of how this setting covers real-world privacy problems in sequential decision-making. We solve this problem in the policy gradient framework by introducing a regularizer based on the mutual information (MI) between the sensitive state and the actions. We develop a model-based stochastic gradient estimator for optimizing privacy-constrained policies. We also discuss an alternative MI regularizer that upper-bounds our main MI regularizer and can be optimized in a model-free setting, and a powerful direct estimator that can be used in environments with differentiable dynamics. We contrast previous work on differentially private RL with our mutual-information formulation of information disclosure. Experimental results show that our training method yields policies that hide the sensitive state, even in challenging high-dimensional tasks.
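To make the regularizer concrete, the following is a minimal sketch for a toy single-step, tabular case. It is not the paper's gradient estimator. It assumes a discrete sensitive variable U with a known prior, a policy given as a table of action probabilities conditioned on U, and a trade-off weight `lam`. The function names and the weight are hypothetical, introduced only for illustration. The sketch computes the exact mutual information I(U; A) and subtracts it from an expected reward.

```python
import numpy as np

def mutual_information(p_u, pi):
    """Exact I(U; A) in nats for a discrete sensitive variable U and action A.

    p_u: shape (n_u,), prior over the sensitive variable (assumed known).
    pi:  shape (n_u, n_a), pi[u, a] = probability of action a given U = u.
    """
    p_u = np.asarray(p_u, dtype=float)
    pi = np.asarray(pi, dtype=float)
    joint = p_u[:, None] * pi              # p(u, a)
    p_a = joint.sum(axis=0)                # marginal p(a)
    indep = p_u[:, None] * p_a[None, :]    # p(u) * p(a)
    mask = joint > 0                       # treat 0 * log 0 as 0
    return float(np.sum(joint[mask] * np.log(joint[mask] / indep[mask])))

def regularized_objective(expected_reward, p_u, pi, lam):
    """Reward minus lam * I(U; A); lam is a hypothetical trade-off weight."""
    return expected_reward - lam * mutual_information(p_u, pi)

p_u = np.array([0.5, 0.5])
leaky = np.array([[1.0, 0.0], [0.0, 1.0]])    # action reveals U exactly
private = np.array([[0.5, 0.5], [0.5, 0.5]])  # action is independent of U

mi_leaky = mutual_information(p_u, leaky)
mi_private = mutual_information(p_u, private)
```

Under these assumptions, the policy whose action copies U carries log 2 nats of information about it, while the policy that ignores U carries none. At equal reward, the regularized objective therefore prefers the second policy. In the sequential, high-dimensional settings the abstract describes, the MI cannot be computed exactly like this, which is why gradient estimators are needed.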

Genre:
- Research Report > New Finding (0.48)

Industry:
- Information Technology > Security & Privacy (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning > Optimization (0.68)
  - Machine Learning
    - Reinforcement Learning (0.87)
    - Neural Networks (0.67)
    - Statistical Learning > Gradient Descent (0.34)
