Collaborating Authors

Locally private nonparametric confidence intervals and sequences Machine Learning

This work derives methods for performing nonparametric, nonasymptotic statistical inference for population parameters under the constraint of local differential privacy (LDP). Given observations $(X_1, \dots, X_n)$ with mean $\mu^\star$ that are privatized into $(Z_1, \dots, Z_n)$, we introduce confidence intervals (CI) and time-uniform confidence sequences (CS) for $\mu^\star \in \mathbb R$ when only given access to the privatized data. We introduce a nonparametric and sequentially interactive generalization of Warner's famous "randomized response" mechanism, satisfying LDP for arbitrary bounded random variables, and then provide CIs and CSs for their means given access to the resulting privatized observations. We extend these CSs to capture time-varying (non-stationary) means, and conclude by illustrating how these methods can be used to conduct private online A/B tests.

Evaluating the Performance of Reinforcement Learning Algorithms Machine Learning

Performance evaluations are critical for quantifying algorithmic advances in reinforcement learning. Recent reproducibility analyses have shown that reported performance results are often inconsistent and difficult to replicate. In this work, we argue that the inconsistency of performance stems from the use of flawed evaluation metrics. Taking a step towards ensuring that reported results are consistent, we propose a new comprehensive evaluation methodology for reinforcement learning algorithms that produces reliable measurements of performance both on a single environment and when aggregated across environments. We demonstrate this method by evaluating a broad class of reinforcement learning algorithms on standard benchmark tasks.



The Analytics Pane might be quite daunting, or perhaps you didn't even know it existed - either way, join us as we unpack the Confidence Intervals, Boxplots, Quartiles and distribution bands in this 3 part series/ This first episode we look at the constant line, average line, median with quartiles, 95% confidence intervals in addition to the custom analytics available.

Black-box Confidence Intervals: Excel and Perl Implementation


Check original article for most recent updates. Confidence interval is abbreviated as CI. In this new article (part of our series on robust techniques for automated data science) we describe an implementation both in Excel and Perl, and discussion about our popular model-free confidence interval technique introduced in our original Analyticbridge article. This is part of our series on data science techniques suitable for automation, usebla by non-experts. The next one to be detailed (with source code) will be our Hidden Decision Trees. Figure 1 is based on simulated data that does not follow a normal distribution: see section 2 and Figure 2 in this article.

In Defense of the Indefensible: A Very Naive Approach to High-Dimensional Inference Machine Learning

In recent years, a great deal of interest has focused on conducting inference on the parameters in a linear model in the high-dimensional setting. In this paper, we consider a simple and very na\"{i}ve two-step procedure for this task, in which we (i) fit a lasso model in order to obtain a subset of the variables; and (ii) fit a least squares model on the lasso-selected set. Conventional statistical wisdom tells us that we cannot make use of the standard statistical inference tools for the resulting least squares model (such as confidence intervals and $p$-values), since we peeked at the data twice: once in running the lasso, and again in fitting the least squares model. However, in this paper, we show that under a certain set of assumptions, with high probability, the set of variables selected by the lasso is deterministic. Consequently, the na\"{i}ve two-step approach can yield confidence intervals that have asymptotically correct coverage, as well as p-values with proper Type-I error control. Furthermore, this two-step approach unifies two existing camps of work on high-dimensional inference: one camp has focused on inference based on a sub-model selected by the lasso, and the other has focused on inference using a debiased version of the lasso estimator.