Off-policy Evaluation in Doubly Inhomogeneous Environments

Zeyu Bian, Chengchun Shi, Zhengling Qi, Lan Wang

arXiv.org Artificial Intelligence 

Reinforcement learning (RL, Sutton and Barto, 2018) aims to maximize an agent's long-term reward by learning an optimal policy that determines the best action to take in every circumstance. RL is closely related to dynamic treatment regimes (DTRs), also known as adaptive treatment strategies, studied in statistical research on precision medicine (Murphy, 2003; Robins, 2004; Qian and Murphy, 2011; Kosorok and Moodie, 2015; Tsiatis et al., 2019; Qi et al., 2020; Zhou et al., 2022a), which seek the treatment policy that maximizes patients' expected outcome in finite-horizon settings with a small number of treatment stages. Nevertheless, the statistical methods for DTRs mentioned above typically cannot handle long or infinite horizons: they require the number of trajectories to tend to infinity to achieve estimation consistency, whereas RL methods can remain consistent even with a finite number of trajectories under certain conditions. Beyond precision medicine, RL has been applied to various fields, such as games (Silver et al., 2016), ridesharing (Xu et al., 2018), mobile health (Liao et al., 2021) and robotics (Levine et al., 2020).
