difference-in-means estimator
- North America > United States > California > San Mateo County > Menlo Park (0.05)
- North America > Mexico > Oaxaca (0.04)
- North America > United States > New Jersey > Mercer County > Princeton (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Research Report > Strength High (1.00)
- Research Report > Experimental Study (1.00)
Machine Learning for Variance Reduction in Online Experiments
We consider the problem of variance reduction in randomized controlled trials, through the use of covariates correlated with the outcome but independent of the treatment. We propose a machine learning regression-adjusted treatment effect estimator, which we call MLRATE. MLRATE uses machine learning predictors of the outcome to reduce estimator variance. It employs cross-fitting to avoid overfitting biases, and we prove consistency and asymptotic normality under general conditions. MLRATE is robust to poor predictions from the machine learning step: if the predictions are uncorrelated with the outcomes, the estimator performs asymptotically no worse than the standard difference-in-means estimator, while if predictions are highly correlated with outcomes, the efficiency gains are large. In A/A tests, for a set of 48 outcome metrics commonly monitored in Facebook experiments, the estimator has over $70\%$ lower variance than the simple difference-in-means estimator, and about $19\%$ lower variance than the common univariate procedure which adjusts only for pre-experiment values of the outcome.
- Research Report > Strength High (0.98)
- Research Report > Experimental Study (0.98)
- North America > United States > California > San Mateo County > Menlo Park (0.05)
- North America > Mexico > Oaxaca (0.04)
- North America > United States > New Jersey > Mercer County > Princeton (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Research Report > Strength High (1.00)
- Research Report > Experimental Study (1.00)
Practical Improvements of A/B Testing with Off-Policy Estimation
Sakhi, Otmane, Gilotte, Alexandre, Rohde, David
We address the problem of A/B testing, a widely used protocol for evaluating the potential improvement achieved by a new decision system compared to a baseline. This protocol segments the population into two subgroups, each exposed to a version of the system and estimates the improvement as the difference between the measured effects. In this work, we demonstrate that the commonly used difference-in-means estimator, while unbiased, can be improved. We introduce a family of unbiased off-policy estimators that achieves lower variance than the standard approach. Among this family, we identify the estimator with the lowest variance. The resulting estimator is simple, and offers substantial variance reduction when the two tested systems exhibit similarities. Our theoretical analysis and experimental results validate the effectiveness and practicality of the proposed method.
- Europe > France > Île-de-France > Paris > Paris (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Research Report > Experimental Study (0.46)
- Research Report > New Finding (0.46)
Machine Learning for Variance Reduction in Online Experiments
We consider the problem of variance reduction in randomized controlled trials, through the use of covariates correlated with the outcome but independent of the treatment. We propose a machine learning regression-adjusted treatment effect estimator, which we call MLRATE. MLRATE uses machine learning predictors of the outcome to reduce estimator variance. It employs cross-fitting to avoid overfitting biases, and we prove consistency and asymptotic normality under general conditions. MLRATE is robust to poor predictions from the machine learning step: if the predictions are uncorrelated with the outcomes, the estimator performs asymptotically no worse than the standard difference-in-means estimator, while if predictions are highly correlated with outcomes, the efficiency gains are large.
- Research Report > Strength High (1.00)
- Research Report > Experimental Study (1.00)
Improving experiment precision with machine learning - Meta Research
Experimentation is a central part of data-driven product development, yet in practice the results from experiments may be too imprecise to be of much help in improving decision-making. One possible response is to reduce statistical noise by simply running larger experiments. However, this is not always desirable, or even feasible. This raises the question of how we can make better use of the data we have and get sharper, more precise experimental estimates without having to enroll more people in the test. In a collaboration between Meta's Core Data Science and Experimentation Platform teams, we developed a new methodology for making progress on this problem, which both has formal statistical guarantees and is scalable enough to implement in practice.
Machine Learning for Variance Reduction in Online Experiments
Guo, Yongyi, Coey, Dominic, Konutgan, Mikael, Li, Wenting, Schoener, Chris, Goldman, Matt
We consider the problem of variance reduction in randomized controlled trials, through the use of covariates correlated with the outcome but independent of the treatment. We propose a machine learning regression-adjusted treatment effect estimator, which we call MLRATE. MLRATE uses machine learning predictors of the outcome to reduce estimator variance. It employs cross-fitting to avoid overfitting biases, and we prove consistency and asymptotic normality under general conditions. MLRATE is robust to poor predictions from the machine learning step: if the predictions are uncorrelated with the outcomes, the estimator performs asymptotically no worse than the standard difference-in-means estimator, while if predictions are highly correlated with outcomes, the efficiency gains are large. In A/A tests, for a set of 48 outcome metrics commonly monitored in Facebook experiments the estimator has over 70% lower variance than the simple difference-in-means estimator, and about 19% lower variance than the common univariate procedure which adjusts only for pre-experiment values of the outcome.
- North America > United States > California > San Mateo County > Menlo Park (0.04)
- North America > Mexico > Oaxaca (0.04)
- North America > United States > New Jersey > Mercer County > Princeton (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Research Report > Strength High (1.00)
- Research Report > Experimental Study (1.00)