A Maximum-Entropy Approach to Off-Policy Evaluation in Average-Reward MDPs