Kernel Mean Shrinkage Estimators

arXiv.org Machine Learning

A mean function in a reproducing kernel Hilbert space (RKHS), or a kernel mean, is central to kernel methods in that it is used by many classical algorithms, such as kernel principal component analysis, and it also forms the core inference step of modern kernel methods that rely on embedding probability distributions in RKHSs. Given a finite sample, the empirical average is commonly used as the standard estimator of the true kernel mean. Despite the widespread use of this estimator, we show that it can be improved thanks to the well-known Stein phenomenon. We propose a new family of estimators called kernel mean shrinkage estimators (KMSEs), which benefit from both theoretical justification and good empirical performance. The results demonstrate that the proposed estimators outperform the standard one, especially in a "large d, small n" paradigm.
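
To make the mechanism concrete, here is a minimal Python/NumPy sketch of the simplest member of such a family: shrinking the empirical kernel mean toward the origin. The helper names (`rbf_kernel`, `kmse_weights`) and the fixed shrinkage parameter `alpha` are illustrative assumptions; the paper's KMSE family is broader and chooses the shrinkage intensity in a data-driven way. Since every estimator here lies in the span of k(x_i, .), both means are represented purely by weight vectors, and RKHS norms reduce to quadratic forms in the Gram matrix.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    """Gaussian RBF kernel matrix: k(x, y) = exp(-gamma * ||x - y||^2)."""
    sq_dists = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq_dists)

def kmse_weights(n, alpha):
    """Weights of the simplest shrinkage estimator, which pulls the
    empirical kernel mean (uniform weights 1/n) toward the origin:
        mu_shrink = (1 - alpha) * (1/n) * sum_i k(x_i, .)
    Here alpha in [0, 1) is a hypothetical, hand-set shrinkage
    parameter; the paper selects the intensity from the data."""
    return (1.0 - alpha) * np.full(n, 1.0 / n)

# "Large d, small n" style toy sample: n = 30 points in d = 100.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 100))
K = rbf_kernel(X, X, gamma=0.01)

w_emp = np.full(30, 1.0 / 30)          # empirical kernel mean weights
w_shr = kmse_weights(30, alpha=0.1)    # shrunk weights

# RKHS distance between the two estimators: ||mu_emp - mu_shrink||_H.
diff = w_emp - w_shr
print(np.sqrt(diff @ K @ diff))
```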


Dictionary.com Chooses 'Complicit' as Its Word of the Year

U.S. News

That's when a house in The Hamptons where the episode was filmed went on the market. For the record: Jason Alexander's character, George Costanza, emerges from a pool with "shrinkage," which is then noted by Jerry's girlfriend.


Kernel Mean Estimation via Spectral Filtering

arXiv.org Machine Learning

The problem of estimating the kernel mean in a reproducing kernel Hilbert space (RKHS) is central to kernel methods in that it is used by classical approaches (e.g., when centering a kernel PCA matrix), and it also forms the core inference step of modern kernel methods (e.g., kernel-based non-parametric tests) that rely on embedding probability distributions in RKHSs. Muandet et al. (2014) showed that shrinkage can help in constructing "better" estimators of the kernel mean than the empirical estimator. The present paper studies the consistency and admissibility of the estimators in Muandet et al. (2014), and proposes a wider class of shrinkage estimators that improve upon the empirical estimator by considering appropriate basis functions. Using the kernel PCA basis, we show that some of these estimators can be constructed using spectral filtering algorithms, which are shown to be consistent under some technical assumptions. Our theoretical analysis also reveals a fundamental connection to the kernel-based supervised learning framework. The proposed estimators are simple to implement and perform well in practice.
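
As a rough illustration of the spectral-filtering viewpoint, the sketch below eigendecomposes the Gram matrix to obtain the kernel PCA basis and applies a Tikhonov-type filter to the empirical mean weights. Treat this as an assumption-laden simplification of the general construction, not the paper's exact estimator; the filter function `g(s) = s / (s + lam)` is one standard choice from the inverse-problems literature, and other filters slot in the same way.

```python
import numpy as np

def spectral_kmse_weights(K, lam, filter_fn=None):
    """Shrinkage weights obtained by spectrally filtering the empirical
    kernel mean in the kernel PCA (eigen) basis of K / n.

    By default a Tikhonov-type filter g(s) = s / (s + lam) is applied,
    which damps directions of small empirical variance; alternatives
    (e.g., truncated SVD) can be passed via `filter_fn`. This is a
    sketch of the general idea, not the paper's exact estimator.
    """
    n = K.shape[0]
    if filter_fn is None:
        filter_fn = lambda s: s / (s + lam)      # Tikhonov filter
    evals, evecs = np.linalg.eigh(K / n)         # kernel PCA basis
    g = filter_fn(np.clip(evals, 0.0, None))     # filtered spectrum
    w_emp = np.full(n, 1.0 / n)                  # empirical mean weights
    return evecs @ (g * (evecs.T @ w_emp))

# Toy usage with a linear kernel: mu_hat = sum_i beta_i k(x_i, .).
rng = np.random.default_rng(1)
X = rng.normal(size=(20, 5))
K = X @ X.T
beta = spectral_kmse_weights(K, lam=0.1)
```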


Entropy inference and the James-Stein estimator, with application to nonlinear gene association networks

arXiv.org Machine Learning

We present a procedure for effective estimation of entropy and mutual information from small-sample data, and apply it to the problem of inferring high-dimensional gene association networks. Specifically, we develop a James-Stein-type shrinkage estimator, resulting in a procedure that is highly efficient both statistically and computationally. Despite its simplicity, we show that it outperforms eight other entropy estimation procedures across a diverse range of sampling scenarios and data-generating models, even in cases of severe undersampling. We illustrate the approach by analyzing E. coli gene expression data and computing an entropy-based gene association network from it. A computer program that implements the proposed shrinkage estimator is available.
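
The core recipe is easy to state: shrink the maximum-likelihood cell frequencies toward the uniform distribution with a data-driven intensity, then plug the shrunk frequencies into the entropy formula. The sketch below follows that recipe; the intensity formula matches the James-Stein-type estimate described in this line of work, but treat it as an illustrative reconstruction rather than the authors' reference implementation.

```python
import numpy as np

def entropy_shrink(counts):
    """James-Stein-type shrinkage entropy estimate (in nats).

    Cell frequencies are shrunk toward the uniform distribution with a
    data-driven intensity lam, then plugged into -sum(theta * log theta).
    Sketch of the shrinkage-toward-uniform idea, not a reference
    implementation.
    """
    counts = np.asarray(counts, dtype=float)
    n = counts.sum()                      # total number of observations
    p = counts.size                       # number of cells
    theta_ml = counts / n                 # maximum-likelihood frequencies
    target = np.full(p, 1.0 / p)          # shrinkage target: uniform
    # Estimated optimal shrinkage intensity, clipped to [0, 1].
    num = 1.0 - np.sum(theta_ml ** 2)
    den = (n - 1.0) * np.sum((target - theta_ml) ** 2)
    lam = 1.0 if den == 0 else min(1.0, max(0.0, num / den))
    theta = lam * target + (1.0 - lam) * theta_ml
    nz = theta > 0
    return -np.sum(theta[nz] * np.log(theta[nz]))

# Severe undersampling: sparse counts over 5 cells from only 10 draws.
print(entropy_shrink([8, 1, 0, 0, 1]))
```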


Multiple sclerosis drug is first to dramatically cut brain shrinkage

New Scientist

An experimental drug for the most severe forms of multiple sclerosis has slowed brain shrinkage by nearly half. There are dozens of approved therapies for the relapsing form of MS, a disease of the nervous system in which people can be symptom-free for months before another attack. But there are very few for people with the more severe forms of the disease – known as primary progressive and secondary progressive MS – in which there is rarely any respite from disabling symptoms.