Marginal Density Ratio for Off-Policy Evaluation in Contextual Bandits
Muhammad Faaiz Taufiq
Neural Information Processing Systems
Propensity Score (MIPS) estimator, proving that MR achieves lower variance among a generalized family of MIPS estimators.
Feb-16-2026, 07:50:49 GMT
- Country:
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- North America > United States (0.14)
- Genre:
- Research Report > New Finding (0.67)
- Industry:
- Health & Medicine (0.92)
- Technology: