Goto

Collaborating Authors

 m-estimator


Optimal Subsampling with Influence Functions

Daniel Ting, Eric Brochu

Neural Information Processing Systems

As the amount of data increases, the question arises as to how best to deal with the large datasets. While computational platforms such as Spark [28] and Ray [23] help process large datasets once a desired model is chosen, simply using smaller data can be a faster solution for exploratory data modeling, rapid prototyping, or other tasks where the accuracy obtainable from the full dataset is notneeded.




On model selection consistency of penalized M-estimators: a geometric theory

Neural Information Processing Systems

Penalized M-estimators are used in diverse areas of science and engineering to fit high-dimensional models with some low-dimensional structure. Often, the penalties are \emph{geometrically decomposable}, \ie\ can be expressed as a sum of (convex) support functions. We generalize the notion of irrepresentable to geometrically decomposable penalties and develop a general framework for establishing consistency and model selection consistency of M-estimators with such penalties. We then use this framework to derive results for some special cases of interest in bioinformatics and statistical learning.


Statistical Inference with M-Estimators on Adaptively Collected Data

Neural Information Processing Systems

Bandit algorithms are increasingly used in real-world sequential decision-making problems. Associated with this is an increased desire to be able to use the resulting datasets to answer scientific questions like: Did one type of ad lead to more purchases? In which contexts is a mobile health intervention effective? However, classical statistical approaches fail to provide valid confidence intervals when used with data collected with bandit algorithms. Alternative methods have recently been developed for simple models (e.g., comparison of means). Yet there is a lack of general methods for conducting statistical inference using more complex models on data collected with (contextual) bandit algorithms; for example, current methods cannot be used for valid inference on parameters in a logistic regression model for a binary reward. In this work, we develop theory justifying the use of M-estimators---which includes estimators based on empirical risk minimization as well as maximum likelihood---on data collected with adaptive algorithms, including (contextual) bandit algorithms. Specifically, we show that M-estimators, modified with particular adaptive weights, can be used to construct asymptotically valid confidence regions for a variety of inferential targets.


Frank-Wolfe-based Algorithms for Approximating Tyler's M-estimator

Neural Information Processing Systems

Tyler's M-estimator is a well known procedure for robust and heavy-tailed covariance estimation. Tyler himself suggested an iterative fixed-point algorithm for computing his estimator however, it requires super-linear (in the size of the data) runtime per iteration, which maybe prohibitive in large scale. In this work we propose, to the best of our knowledge, the first Frank-Wolfe-based algorithms for computing Tyler's estimator. One variant uses standard Frank-Wolfe steps, the second also considers \textit{away-steps} (AFW), and the third is a \textit{geodesic} version of AFW (GAFW). AFW provably requires, up to a log factor, only linear time per iteration, while GAFW runs in linear time (up to a log factor) in a large $n$ (number of data-points) regime. All three variants are shown to provably converge to the optimal solution with sublinear rate, under standard assumptions, despite the fact that the underlying optimization problem is not convex nor smooth. Under an additional fairly mild assumption, that holds with probability 1 when the (normalized) data-points are i.i.d.





Optimal Bounds for Tyler's M-Estimator for Elliptical Distributions

Lau, Lap Chi, Ramachandran, Akshay

arXiv.org Artificial Intelligence

A fundamental problem in statistics is estimating the shape matrix of an Elliptical distribution. This generalizes the familiar problem of Gaussian covariance estimation, for which the sample covariance achieves optimal estimation error. For Elliptical distributions, Tyler proposed a natural M-estimator and showed strong statistical properties in the asymptotic regime, independent of the underlying distribution. Numerical experiments show that this estimator performs very well, and that Tyler's iterative procedure converges quickly to the estimator. Franks and Moitra recently provided the first distribution-free error bounds in the finite sample setting, as well as the first rigorous convergence analysis of Tyler's iterative procedure. However, their results exceed the sample complexity of the Gaussian setting by a $\log^{2} d$ factor. We close this gap by proving optimal sample threshold and error bounds for Tyler's M-estimator for all Elliptical distributions, fully matching the Gaussian result. Moreover, we recover the algorithmic convergence even at this lower sample threshold. Our approach builds on the operator scaling connection of Franks and Moitra by introducing a novel pseudorandom condition, which we call $\infty$-expansion. We show that Elliptical distributions satisfy $\infty$-expansion at the optimal sample threshold, and then prove a novel scaling result for inputs satisfying this condition.