Replicating Complex Dialogue Policy of Humans via Offline Imitation Learning with Supervised Regularization
