Inverse Transition Learning: Learning Dynamics from Demonstrations

Leo Benac, Abhishek Sharma, Sonali Parbhoo, Finale Doshi-Velez

Nov-7-2024–arXiv.org Machine Learning

We consider the problem of estimating the transition dynamics $T^*$ from near-optimal expert trajectories in the context of offline model-based reinforcement learning. We develop a novel constraint-based method, Inverse Transition Learning, that treats the limited coverage of the expert trajectories as a \emph{feature}: we use the fact that the expert is near-optimal to inform our estimate of $T^*$. We integrate our constraints into a Bayesian approach. Across both synthetic environments and real healthcare scenarios, such as the management of hypotension in Intensive Care Unit (ICU) patients, we demonstrate not only significant improvements in decision-making but also that our posterior can indicate when transfer will be successful.
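To make the central idea concrete, the following is a minimal sketch of how expert near-optimality could constrain a candidate transition model. It is an illustration under stated assumptions, not the paper's algorithm: it assumes a tabular MDP with a known reward table, and the helper names (`q_values`, `expert_consistent`), the tolerance `eps`, and the toy two-state environment are all hypothetical. A candidate $T$ is kept only if every observed expert action is within `eps` of optimal under that candidate.

```python
import numpy as np

def q_values(T, R, gamma=0.9, iters=500):
    """Optimal Q-values of a tabular MDP via value iteration.
    T: (S, A, S) transition probabilities, R: (S, A) rewards."""
    Q = np.zeros_like(R, dtype=float)
    for _ in range(iters):
        V = Q.max(axis=1)        # greedy state values, shape (S,)
        Q = R + gamma * (T @ V)  # (S, A, S) @ (S,) -> (S, A)
    return Q

def expert_consistent(T, R, expert_pairs, eps=0.1, gamma=0.9):
    """True if every observed (state, action) pair is eps-optimal under T.
    The eps-optimality test is an assumed stand-in for the paper's constraints."""
    Q = q_values(T, R, gamma)
    return all(Q[s, a] >= Q[s].max() - eps for s, a in expert_pairs)

# Toy environment (hypothetical): state 1 is an absorbing, rewarding state.
R = np.array([[0.0, 0.0],
              [1.0, 1.0]])

# Candidate in which action 0 reliably moves state 0 -> state 1.
T_good = np.array([[[0.1, 0.9], [0.9, 0.1]],
                   [[0.0, 1.0], [0.0, 1.0]]])

# Candidate with the two actions' effects swapped in state 0.
T_bad = np.array([[[0.9, 0.1], [0.1, 0.9]],
                  [[0.0, 1.0], [0.0, 1.0]]])

# The expert was observed taking action 0 in state 0.
expert_pairs = [(0, 0)]

keep_good = expert_consistent(T_good, R, expert_pairs)
keep_bad = expert_consistent(T_bad, R, expert_pairs)
```

In this toy case the candidate that makes the expert's action look clearly suboptimal is ruled out, even though the data contain no direct observations of the other action. That is the sense in which limited coverage carries information about $T^*$. A Bayesian treatment would place a posterior over such candidates rather than applying a hard filter.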

artificial intelligence, constraint, machine learning

Country:
- North America > United States
  - Illinois > Cook County > Chicago (0.04)
- Asia > Middle East
  - Israel (0.04)

Genre:
- Research Report > New Finding (0.67)

Industry:
- Health & Medicine > Health Care Providers & Services (0.66)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning > Uncertainty
    - Bayesian Inference (1.00)
  - Machine Learning > Learning Graphical Models
    - Directed Networks > Bayesian Learning (1.00)