Review for NeurIPS paper: Causal Imitation Learning With Unobserved Confounders

Neural Information Processing Systems 

The paper considers a very general setting with possible unobserved confounders, in which the expert and the policy can have different inputs and the reward is unobserved. The work presents multiple criteria for ensuring successful imitation, in particular criteria based on proxy variables for task rewards.