Time-uniform confidence bands for the CDF under nonstationarity
Paul Mineiro, Steven R. Howard
arXiv.org Artificial Intelligence
What would have happened if I had acted differently? Although this question is as old as time itself, successful companies have recently embraced it via counterfactual estimation of outcomes from the exhaust of their controlled experimentation platforms, e.g., those based upon A/B testing or contextual bandits. These experiments are run in the real (digital) world, which is rich enough to demand statistical techniques that are non-asymptotic, non-parametric, and non-stationary. Although recent advances admit characterizing counterfactual average outcomes in this general setting, counterfactually estimating a complete distribution of outcomes has heretofore been possible only with additional assumptions. Nonetheless, the practical importance of this problem has motivated multiple solutions: see Table 1 for a summary, and Section 5 for a complete discussion. Intriguingly, this problem is provably impossible in the data-dependent setting without additional assumptions [Rakhlin et al., 2015]. Consequently, our bounds always achieve non-asymptotic coverage, but may converge to zero width slowly or not at all, depending on the hardness of the instance. We call this design principle AVAST (Always Valid And Sometimes Trivial). In pursuit of our ultimate goal, we derive factual distribution estimators, which are useful for estimating the complete distribution of outcomes from direct experience.
Feb-27-2023
- Country:
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Genre:
- Research Report > Experimental Study (0.67)
- Technology: