OptiDICE: Offline Policy Optimization via Stationary Distribution Correction Estimation
