Privacy Preserving Off-Policy Evaluation

Xie, Tengyang, Thomas, Philip S., Miklau, Gerome

Jan-31-2019–arXiv.org Machine Learning

Many proposed applications of reinforcement learning (RL) involve the use of data that could contain sensitive information. For example, Raghu et al. [2017] proposed an application of RL and off-policy evaluation methods that uses people's medical records, and Theocharous et al. [2015] applied off-policy evaluation methods to user data collected by a bank in order to improve the targeting of advertisements. In examples like these, the data used by the RL systems is sensitive, and one should ensure that the methods applied to the data do not leak any sensitive information. Recently, Balle et al. [2016] showed how techniques from differential privacy can be used to ensure that (with high probability) policy evaluation methods for RL do not leak (much) sensitive information. In this paper we extend their work in two ways.

Genre:
- Research Report (1.00)

Industry:
- Information Technology > Security & Privacy (1.00)
- Health & Medicine > Therapeutic Area
  - Infections and Infectious Diseases (0.48)
  - Immunology (0.47)

Technology:
- Information Technology
  - Security & Privacy (1.00)
  - Artificial Intelligence
    - Representation & Reasoning (1.00)
    - Machine Learning
      - Reinforcement Learning (1.00)
      - Statistical Learning (0.94)
