Review for NeurIPS paper: A Maximum-Entropy Approach to Off-Policy Evaluation in Average-Reward MDPs

Open in new window