 support estimation


Navigating Sparsities in High-Dimensional Linear Contextual Bandits

arXiv.org Artificial Intelligence

High-dimensional linear contextual bandit problems remain a significant challenge due to the curse of dimensionality. Existing methods typically consider either the model parameters to be sparse or the eigenvalues of context covariance matrices to be (approximately) sparse, lacking general applicability due to the rigidity of conventional reward estimators. To overcome this limitation, a powerful pointwise estimator is introduced in this work that adaptively navigates both kinds of sparsity. Based on this pointwise estimator, a novel algorithm, termed HOPE, is proposed. Theoretical analyses demonstrate that HOPE not only achieves improved regret bounds in previously discussed homogeneous settings (i.e., considering only one type of sparsity) but also, for the first time, efficiently handles two new challenging heterogeneous settings (i.e., considering a mixture of two types of sparsity), highlighting its flexibility and generality. Experiments corroborate the superiority of HOPE over existing methods across various scenarios.


A signal separation view of classification

arXiv.org Machine Learning

The problem of classification in machine learning has often been approached in terms of function approximation. In this paper, we propose an alternative approach for classification in arbitrary compact metric spaces which, in theory, yields both the number of classes and a perfect classification using a minimal number of queried labels. Our approach uses localized trigonometric polynomial kernels initially developed for the point source signal separation problem in signal processing. Rather than point sources, we argue that the various classes come from different probability distributions. The localized kernel technique developed for separating point sources is then shown to separate the supports of these distributions. This is done in a hierarchical manner in our MASC algorithm to accommodate touching or overlapping class boundaries. We illustrate our theory on several simulated and real-life datasets, including the Salinas and Indian Pines hyperspectral datasets and a document dataset.


On the Sample Complexity of Subspace Learning

Neural Information Processing Systems

A large number of algorithms in machine learning, from principal component analysis (PCA) and its non-linear (kernel) extensions to more recent spectral embedding and support estimation methods, rely on estimating a linear subspace from samples. In this paper we introduce a general formulation of this problem and derive novel learning error estimates. Our results rely on natural assumptions on the spectral properties of the covariance operator associated with the data distribution, and hold for a wide class of metrics between subspaces. As special cases, we discuss sharp error estimates for the reconstruction properties of PCA and spectral support estimation. Key to our analysis is an operator-theoretic approach that has broad applicability to spectral learning methods.
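
As a rough illustration of what "estimating a linear subspace from samples" means in the PCA case, the sketch below estimates a top-k principal subspace with NumPy and measures its empirical reconstruction error. The function names `pca_subspace` and `reconstruction_error`, and the choice to center the data, are assumptions made for this example; they are not taken from the paper.

```python
import numpy as np

def pca_subspace(X, k):
    """Return a (d, k) orthonormal basis of the top-k principal subspace of X (n, d).

    Assumption for this sketch: the data are centered before the SVD.
    """
    Xc = X - X.mean(axis=0)
    # Rows of Vt are right singular vectors of the centered data,
    # i.e. eigenvectors of the empirical covariance matrix.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:k].T

def reconstruction_error(X, basis):
    """Mean squared distance between centered samples and their projection onto the subspace."""
    Xc = X - X.mean(axis=0)
    projected = Xc @ basis @ basis.T
    return float(np.mean(np.sum((Xc - projected) ** 2, axis=1)))
```

When the samples lie exactly in a k-dimensional subspace, the reconstruction error of the estimated top-k subspace should vanish up to floating-point error, while a smaller k leaves a positive residual.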


Spectral Regularization for Support Estimation

Neural Information Processing Systems

In this paper we consider the problem of learning from data the support of a probability distribution when the distribution does not have a density (with respect to some reference measure). We propose a new class of regularized spectral estimators based on a new notion of reproducing kernel Hilbert space, which we call "completely regular". Completely regular kernels make it possible to capture the relevant geometric and topological properties of an arbitrary probability space. In particular, they are the key ingredient in proving the universal consistency of the spectral estimators, and in this respect they are the analogue of universal kernels for supervised problems. Numerical experiments show that spectral estimators compare favorably to state-of-the-art machine learning algorithms for density support estimation.
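
One simple way to picture a spectral support estimator is to project the kernel feature map of a query point onto the span of the leading kernel principal components of the sample and use the squared residual as a score: small values suggest the point is close to the support. The sketch below follows that idea with a hard truncation to `m` components. The Laplacian kernel, the truncation rule, and all function names are assumptions chosen for illustration; the paper's own regularized estimators and kernel conditions may differ.

```python
import numpy as np

def laplacian_kernel(A, B, gamma=1.0):
    """Kernel matrix with entries exp(-gamma * ||a - b||) for rows a of A and b of B."""
    dists = np.sqrt(((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2))
    return np.exp(-gamma * dists)

def fit_support_estimator(X, m, gamma=1.0):
    """Keep the top-m eigenpairs of the kernel matrix (hypothetical truncation filter)."""
    K = laplacian_kernel(X, X, gamma)
    sigma, U = np.linalg.eigh(K)  # eigenvalues in ascending order
    sigma, U = sigma[::-1][:m], U[:, ::-1][:, :m]
    return X, U, sigma, gamma

def distance_to_support(model, X_new):
    """Squared feature-space residual K(x, x) - ||P_m phi(x)||^2 for each query point."""
    X, U, sigma, gamma = model
    K_x = laplacian_kernel(X_new, X, gamma)  # shape (q, n)
    coeffs = K_x @ U                         # shape (q, m)
    projected_sq = (coeffs ** 2 / sigma).sum(axis=1)
    return 1.0 - projected_sq                # K(x, x) = 1 for this kernel
```

Thresholding the returned score then gives a set estimate; how to pick the threshold and the number of components `m` is left open here.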


Support-weighted Adversarial Imitation Learning

arXiv.org Machine Learning

Adversarial Imitation Learning (AIL) is a broad family of imitation learning methods designed to mimic expert behaviors from demonstrations. While AIL has shown state-of-the-art performance on imitation learning with only a small number of demonstrations, it faces several practical challenges such as potential training instability and implicit reward bias. To address these challenges, we propose Support-weighted Adversarial Imitation Learning (SAIL), a general framework that extends a given AIL algorithm with information derived from support estimation of the expert policies. SAIL improves the quality of the reinforcement signals by weighting the adversarial reward with a confidence score from support estimation of the expert policy. We also show that SAIL is always at least as efficient as the underlying AIL algorithm that SAIL uses for learning the adversarial reward. Empirically, we show that the proposed method achieves better performance and training stability than baseline methods on a wide range of benchmark control tasks.
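
The reward weighting described above can be pictured as an elementwise product between an adversarial reward and a support-based confidence score. The sketch below assumes exactly that: the multiplicative form, the clipping of the confidence to [0, 1], and the name `support_weighted_reward` are illustrative choices, not details confirmed by the abstract.

```python
import numpy as np

def support_weighted_reward(adv_reward, support_conf):
    """Scale each adversarial reward by a confidence score in [0, 1].

    Assumptions for this sketch: both inputs are per-(state, action) arrays
    of equal shape, and the weighting is a plain elementwise product.
    """
    adv_reward = np.asarray(adv_reward, dtype=float)
    support_conf = np.clip(np.asarray(support_conf, dtype=float), 0.0, 1.0)
    return support_conf * adv_reward
```

Under this form, state-action pairs judged to lie outside the expert's support (confidence near 0) contribute almost no reward, regardless of what the adversarial reward assigns them.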




Random Expert Distillation: Imitation Learning via Expert Policy Support Estimation

arXiv.org Machine Learning

We consider a specific setting of imitation learning, the task of policy learning from expert demonstrations, in which the learner only has a finite number of expert trajectories without any further access to the expert. Two broad categories of approaches to this setting are behavioral cloning (BC) Pomerleau (1991), which directly learns a policy mapping from states to actions with supervised learning from expert trajectories; and inverse reinforcement learning (IRL) Ng & Russell (2000); Abbeel & Ng (2004), which learns a policy via reinforcement learning, using a cost function extracted from expert trajectories. Most notably, BC has been successfully applied to the task of autonomous driving Bojarski et al. (2016); Bansal et al. (2018). Despite its simplicity, BC typically requires a large amount of training data to learn good policies, as it may suffer from compounding errors caused by covariate shift Ross & Bagnell (2010); Ross et al. (2011). BC is often used as a policy initialization step for further reinforcement learning Nagabandi et al. (2018); Rajeswaran et al. (2017). IRL estimates a cost function from expert trajectories and uses reinforcement learning to derive policies. As the cost function evaluates the quality of trajectories rather than that of individual actions, IRL avoids the problem of compounding errors. IRL is effective on a wide range of problems, from continuous control benchmarks in the Mujoco environment Ho & Ermon (2016), to robot footstep planning Ziebart et al. (2008). Generative Adversarial Imitation Learning (GAIL) Ho & Ermon (2016); Baram et al. (2017) connects IRL to the general framework of Generative Adversarial Networks (GANs) Goodfellow et al.


Support Estimation via Regularized and Weighted Chebyshev Approximations

arXiv.org Machine Learning

We introduce a new framework for estimating the support size of an unknown distribution which improves upon known approximation-based techniques. Our main contributions include describing a rigorous new weighted Chebyshev polynomial approximation method and introducing regularization terms into the problem formulation that provably improve the performance of state-of-the-art approximation-based approaches. In particular, we present both theoretical and computer simulation results that illustrate the utility and performance improvements of our method. The theoretical analysis relies on jointly optimizing the bias and variance components of the risk, and combining new weighted minimax polynomial approximation techniques with discretized semi-infinite programming solvers. Such a setting allows for casting the estimation problem as a linear program (LP) with a small number of variables and constraints that may be solved as efficiently as the original Chebyshev approximation-based problem. The described approach also applies to the support coverage and entropy estimation problems. Our newly developed technique is tested on synthetic data and used to estimate the number of bacterial species in the human gut. On synthetic datasets, we observed up to five-fold improvements in the value of the worst-case risk. For the bioinformatics application, metagenomic data from the NIH Human Gut and the American Gut Microbiome were combined and processed to obtain lists of bacterial taxonomies. These were subsequently used to compute the bacterial species histograms and estimate the number of bacterial species in the human gut at roughly 2350, with the species being represented by trillions of cells.

