Q-MMR: Off-Policy Evaluation via Recursive Reweighting and Moment Matching
