Goto

Collaborating Authors

 central limit theorem


YEAST: Yet Another Sequential Test

Neural Information Processing Systems

The online evaluation of machine learning models is typically conducted through A/B experiments. Sequential statistical tests are valuable tools for analysing these experiments, as they enable researchers to stop data collection early without increasing the risk of false discoveries. However, existing sequential tests either limit the number of interim analyses or suffer from low statistical power. In this paper, we introduce a novel sequential test designed for the continuous monitoring of A/B experiments. We validate our method using semi-synthetic simulations and demonstrate that it outperforms current state-of-the-art sequential testing approaches. Our method is derived using a new technique that "inverts" a bound on the probability of threshold crossing, based on a classical maximal inequality.


Statistical Inference for Gradient Boosting Regression

Neural Information Processing Systems

Gradient boosting is widely popular due to its flexibility and predictive accuracy. However, statistical inference and uncertainty quantification for gradient boosting remain challenging and under-explored. We propose a unified framework for statistical inference in gradient boosting regression. Our framework integrates dropout or parallel training with a recently proposed regularization procedure called Boulevard that allows for a central limit theorem (CLT) for boosting. With these enhancements, we surprisingly find that increasing the dropout rate and the number of trees grown in parallel at each iteration substantially enhances signal recovery and overall performance. Our resulting algorithms enjoy similar CLTs, which we use to construct built-in confidence intervals, prediction intervals, and rigorous hypothesis tests for assessing variable importance in only O(nd2) time with the Nystr om method. Numerical experiments verify the asymptotic normality and demonstrate that our algorithms perform well, do not require early stopping, interpolate between regularized boosting and random forests, and confirm the validity of their built-in statistical inference procedures.


Sub-Gaussian Concentration and Entropic Normality of the Maximum Likelihood Estimator

arXiv.org Machine Learning

It is well known that, under standard regularity conditions, the maximum likelihood estimator (MLE) satisfies a central limit theorem and converges in distribution to a Gaussian random variable as the sample size grows. This paper strengthens this classical result by developing several stronger forms of asymptotic normality for the normalized MLE. With additional assumptions on the score, we first establish sub-Gaussian tail bounds and convergence of all moments for the normalized estimation error. We then prove an entropic central limit theorem for a smoothed version of the estimator, showing convergence in relative entropy to the limiting Gaussian law. When the Fisher information of the normalized estimate is bounded, or its density has bounded first derivative, we further show that the smoothing can be removed, yielding entropic normality of the MLE itself. The proofs develop auxiliary tools that may be of independent interest, including exponential consistency bounds, high-moment estimates, and entropy-control arguments for the estimator.





c9e1074f5b3f9fc8ea15d152add07294-AuthorFeedback.pdf

Neural Information Processing Systems

We thank the reviewers for their useful feedback and their time. We have corrected all the minor comments, as suggested. We now provide specific answers to each reviewer below. We thank the reviewer for their positive evaluation of our work and their comments. "All the theoretical contributions seem to me a bit marginal" Since the Sliced-Wasserstein distance is an average of one-6 We will explain these observations more explicitly to clarify our contributions.