LobsDICE: Offline Learning from Observation via Stationary Distribution Correction Estimation

Open in new window