Supplementary Material for BAIL: Best-Action Imitation Learning for Batch Deep Reinforcement Learning

A Proofs of Theorems
