POPCORN: Partially Observed Prediction COnstrained ReiNforcement Learning

Futoma, Joseph, Hughes, Michael C., Doshi-Velez, Finale

Jan-12-2020–arXiv.org Machine Learning

Many medical decision-making settings can be framed as partially observed Markov decision processes (POMDPs). However, popular two-stage approaches that first learn a POMDP model and then solve it often fail because the model that best fits the data may not be the best model for planning. We introduce a new optimization objective that (a) produces both high-performing policies and high-quality generative models, even when some observations are irrelevant for planning, and (b) does so in the kinds of batch, off-policy settings common in medicine. We demonstrate our approach on synthetic examples and a real-world hypotension management task.


