Penalty, Shrinkage, and Preliminary Test Estimators under Full Model Hypothesis

Machine Learning

This paper considers a multiple regression model and compares, under the full model hypothesis, both analytically and by simulation, the performance characteristics of some popular penalty estimators, namely ridge regression (RR), LASSO, adaptive LASSO (aLASSO), SCAD, and elastic net (EN), against the least squares estimator (LSE), restricted estimator (RE), preliminary test estimator (PTE), and Stein-type estimators (SE and PRSE) when the dimension of the parameter space is smaller than the dimension of the sample space. We find that RR uniformly dominates LSE, RE, PTE, SE, and PRSE, while LASSO, aLASSO, SCAD, and EN uniformly dominate LSE only. Further, we observe that neither the penalty estimators nor the Stein-type estimators dominate one another.
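To make two of the estimators named above concrete, here is a minimal sketch of the least squares and ridge regression closed forms on synthetic data. The data, the penalty value `lam=5.0`, and the function names are illustrative assumptions, not taken from the paper, and the sketch does not reproduce the paper's simulation or its dominance results.

```python
import numpy as np

def lse(X, y):
    """Least squares estimator: minimizes ||y - X b||^2."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def ridge(X, y, lam):
    """Ridge estimator: solves (X'X + lam * I) b = X'y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

# Synthetic design with fewer parameters than observations (p < n).
rng = np.random.default_rng(0)
n, p = 50, 5
X = rng.normal(size=(n, p))
beta_true = np.array([2.0, -1.0, 0.0, 0.0, 0.5])  # hypothetical coefficients
y = X @ beta_true + rng.normal(size=n)

beta_lse = lse(X, y)
beta_ridge = ridge(X, y, lam=5.0)  # lam chosen arbitrarily for illustration

print("LSE  :", np.round(beta_lse, 3))
print("Ridge:", np.round(beta_ridge, 3))
```

For any positive penalty the ridge coefficient vector has a smaller Euclidean norm than the least squares one, and setting `lam=0.0` recovers the least squares estimator when `X` has full column rank.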

High Dimensional Structured Superposition Models

Neural Information Processing Systems

High-dimensional superposition models characterize observations using parameters that can be written as a sum of multiple component parameters, each with its own structure, e.g., the sum of a low-rank matrix and a sparse matrix. In this paper, we consider general superposition models that allow the sum of any number of component parameters, where each component's structure can be characterized by any norm. We present a simple estimator for such models, give a geometric condition under which the components can be accurately estimated, characterize the sample complexity of the estimator, and give non-asymptotic bounds on the componentwise estimation error. We use tools from empirical processes and generic chaining for the statistical analysis, and our results, which substantially generalize prior work on superposition models, are stated in terms of Gaussian widths of suitable spherical caps. Papers published at the Neural Information Processing Systems Conference.

Closed-form Estimators for High-dimensional Generalized Linear Models

Neural Information Processing Systems

We propose a class of closed-form estimators for GLMs under high-dimensional sampling regimes. Our class of estimators is based on deriving variants of the vanilla unregularized MLE that are (a) well-defined even in high-dimensional settings, and (b) available in closed form. We then perform thresholding operations on this MLE variant to obtain our class of estimators. We derive a unified statistical analysis of our class of estimators and show that it enjoys strong statistical guarantees in both parameter error and variable selection that, surprisingly, match those of the more complex regularized GLM MLEs, even though our closed-form estimators are computationally much simpler. We derive instantiations of our class of closed-form estimators, as well as corollaries of our general theorem, for the special cases of logistic, exponential, and Poisson regression models.

The Robustness of Estimator Composition

Neural Information Processing Systems

A composite estimator successively applies two (or more) estimators: on data decomposed into disjoint parts, it applies the first estimator to each part, then the second estimator to the outputs of the first, and so on if the composition involves more than two estimators. Informally, the breakdown point is the minimum fraction of data points that, if significantly modified, will also significantly modify the output of the estimator, so it is typically desirable to have a large breakdown point. Our main result shows that, under mild conditions on the individual estimators, the breakdown point of the composite estimator is the product of the breakdown points of the individual estimators. We also demonstrate several scenarios, ranging from regression to statistical testing, where this analysis is easy to apply, is useful in understanding worst-case robustness, and offers insight into the associated data analysis.
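The product rule can be seen on a small example. The following sketch uses a median of medians as the composite estimator; the data, the split into 5 parts of 5 points, and the name `median_of_medians` are illustrative assumptions rather than the paper's experiments. Moving a median of 5 values arbitrarily far requires corrupting 3 of them, so corrupting 3 points in each of 3 parts (9 of 25 points, i.e. 3/5 times 3/5) is enough to move the composite estimate, while the plain median of all 25 points stays bounded.

```python
import numpy as np

def median_of_medians(data, n_parts):
    """Composite estimator: median of the per-part medians."""
    parts = np.array_split(np.asarray(data, dtype=float), n_parts)
    return float(np.median([np.median(part) for part in parts]))

clean = np.arange(1.0, 26.0)  # 25 points: 1, 2, ..., 25

# Adversarial corruption: 3 points in each of the first 3 parts.
corrupted = clean.copy()
bad_idx = [0, 1, 2, 5, 6, 7, 10, 11, 12]
corrupted[bad_idx] = 1e6

print(median_of_medians(clean, 5))      # 13.0
print(median_of_medians(corrupted, 5))  # 1000000.0
print(float(np.median(corrupted)))      # 22.0
```

With the same 36% of points corrupted, the composite estimate jumps to the corrupted value while the single median only shifts from 13 to 22, which is the worst-case trade-off the product formula describes.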

f-divergence estimation and two-sample homogeneity test under semiparametric density-ratio models

Machine Learning

A density ratio is the ratio of two probability densities. We study the inference problem for density ratios and apply a semiparametric density-ratio estimator to the two-sample homogeneity test. In the proposed test procedure, the f-divergence between two probability densities is estimated using a density-ratio estimator. The f-divergence estimator is then used for the two-sample homogeneity test. We derive the optimal estimator of the f-divergence in the sense of asymptotic variance, and then investigate the relation between the proposed test procedure and the existing score test based on the empirical likelihood estimator. Through numerical studies, we illustrate the adequacy of the asymptotic theory for finite-sample inference.
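As a rough illustration of how a density ratio yields an f-divergence estimate, the sketch below averages f(r(x)) over samples from q, using the identity D_f(P||Q) = E_Q[f(p/q)] with f(t) = t log t (the Kullback-Leibler divergence). It assumes the ratio r = p/q is known exactly for two unit-variance Gaussians, whereas the paper estimates the ratio under a semiparametric model; the distributions, sample size, and function names are all hypothetical choices for the example.

```python
import numpy as np

def f_kl(t):
    """Generator f(t) = t * log(t), which yields KL(P || Q)."""
    return t * np.log(t)

def true_ratio(x):
    """r(x) = p(x) / q(x) for p = N(0, 1) and q = N(1, 1)."""
    return np.exp(0.5 - x)

def f_divergence_estimate(samples_q, ratio, f):
    """Sample average of f(r(x)) over draws from q."""
    return float(np.mean(f(ratio(samples_q))))

rng = np.random.default_rng(0)
samples_q = rng.normal(loc=1.0, scale=1.0, size=200_000)  # draws from q

estimate = f_divergence_estimate(samples_q, true_ratio, f_kl)
exact = 0.5  # KL(N(0,1) || N(1,1)) = (mean difference)^2 / 2

print(f"estimate = {estimate:.3f}, exact = {exact}")
```

Under homogeneity (p equal to q) the ratio is identically 1 and the divergence is 0, so a large estimated divergence is evidence against the null hypothesis of the two-sample test.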