Is Your Imitation Learning Policy Better than Mine? Policy Comparison with Near-Optimal Stopping
