Inverse Transition Learning: Learning Dynamics from Demonstrations
Leo Benac, Abhishek Sharma, Sonali Parbhoo, Finale Doshi-Velez
We consider the problem of estimating the transition dynamics $T^*$ from near-optimal expert trajectories in the context of offline model-based reinforcement learning. We develop a novel constraint-based method, Inverse Transition Learning, that treats the limited coverage of the expert trajectories as a \emph{feature}: we use the fact that the expert is near-optimal to inform our estimate of $T^*$. We integrate our constraints into a Bayesian approach. Across both synthetic environments and real healthcare scenarios, such as Intensive Care Unit (ICU) management of patients with hypotension, we demonstrate not only significant improvements in decision-making, but also that our posterior can indicate when transfer will be successful.
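To make the idea of using expert near-optimality as a constraint on $T^*$ concrete, here is a minimal sketch for a tabular MDP. It is not the authors' algorithm: it assumes a known reward matrix `R`, and the function names (`q_values`, `consistent_with_expert`), the tolerance `eps`, and the toy two-state example are all hypothetical choices made for illustration. The check simply asks whether every observed expert action is within `eps` of optimal under a candidate transition model `T`.

```python
import numpy as np

def q_values(T, R, gamma=0.95, iters=500):
    """Approximate optimal Q-values by value iteration.

    T: array of shape (S, A, S), T[s, a, s'] = P(s' | s, a).
    R: array of shape (S, A), assumed known here (an illustrative assumption).
    """
    S, A, _ = T.shape
    Q = np.zeros((S, A))
    for _ in range(iters):
        V = Q.max(axis=1)        # greedy state values
        Q = R + gamma * (T @ V)  # Bellman optimality backup, shape (S, A)
    return Q

def consistent_with_expert(T, R, expert_actions, eps=0.0, gamma=0.95):
    """Return True if each observed expert action is eps-optimal under T.

    expert_actions: dict mapping a visited state to the action the expert took.
    States the expert never visited impose no constraint.
    """
    Q = q_values(T, R, gamma)
    V = Q.max(axis=1)
    return all(Q[s, a] >= V[s] - eps for s, a in expert_actions.items())

# Toy example (hypothetical): state 1 is rewarding and absorbing.
R = np.array([[0.0, 0.0],
              [1.0, 1.0]])

# Candidate A: in state 0, action 0 usually reaches the rewarding state.
T_a = np.array([[[0.1, 0.9], [0.9, 0.1]],
                [[0.0, 1.0], [0.0, 1.0]]])

# Candidate B: the roles of the two actions in state 0 are swapped.
T_b = np.array([[[0.9, 0.1], [0.1, 0.9]],
                [[0.0, 1.0], [0.0, 1.0]]])

expert_actions = {0: 0}  # the expert was only observed in state 0, taking action 0

print(consistent_with_expert(T_a, R, expert_actions))
print(consistent_with_expert(T_b, R, expert_actions))
```

In this toy case, candidate A passes the check and candidate B fails it, even though the expert never tried action 1 in state 0. That is the sense in which limited coverage can be informative: the expert's choice alone rules out transition models under which that choice would be clearly suboptimal.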
Nov-7-2024
- Country:
- Asia > Middle East
- Israel (0.04)
- North America > United States
- Illinois > Cook County > Chicago (0.04)
- Genre:
- Research Report > New Finding (0.67)
- Industry: