 density ridge



Linear Convergence of the Subspace Constrained Mean Shift Algorithm: From Euclidean to Directional Data

Zhang, Yikun, Chen, Yen-Chi

arXiv.org Machine Learning

This paper studies linear convergence of the subspace constrained mean shift (SCMS) algorithm, a well-known algorithm for identifying a density ridge defined by a kernel density estimator. By arguing that the SCMS algorithm is a special variant of a subspace constrained gradient ascent (SCGA) algorithm with an adaptive step size, we derive linear convergence of such an SCGA algorithm. While the existing research focuses mainly on density ridges in Euclidean space, we generalize density ridges and the SCMS algorithm to directional data. In particular, we establish the stability theorem of density ridges with directional data and prove the linear convergence of our proposed directional SCMS algorithm.
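To make the SCMS iteration described in this abstract concrete, here is a minimal sketch for planar Euclidean data and a one-dimensional ridge. It rests on several assumptions that are not taken from the paper: a Gaussian kernel, a fixed bandwidth `h`, and the log-density variant of SCMS, in which the normal direction is the eigenvector of the log-density Hessian with the smallest eigenvalue. For a Gaussian kernel that eigenvector coincides with the direction of smallest kernel-weighted local covariance, which is what the code computes. The function names `scms_step` and `scms` are hypothetical.

```python
import math

def scms_step(point, data, h):
    """One SCMS update in 2-D for a 1-D ridge (Gaussian kernel, assumed)."""
    px, py = point
    w_sum = mx = my = sxx = sxy = syy = 0.0
    for xi, yi in data:
        dx, dy = xi - px, yi - py
        w = math.exp(-(dx * dx + dy * dy) / (2.0 * h * h))
        w_sum += w
        mx += w * dx
        my += w * dy
        sxx += w * dx * dx
        sxy += w * dx * dy
        syy += w * dy * dy
    if w_sum == 0.0:
        # All weights underflowed: the point is too far from the data to move.
        return point, 0.0
    # Mean-shift vector (weighted mean of the data minus the current point).
    mx /= w_sum
    my /= w_sum
    # Kernel-weighted local covariance [[a, b], [b, c]].
    a = sxx / w_sum - mx * mx
    b = sxy / w_sum - mx * my
    c = syy / w_sum - my * my
    # Smallest eigenvalue of the symmetric 2x2 matrix, in closed form.
    lam = 0.5 * (a + c) - math.hypot(0.5 * (a - c), b)
    # Two algebraically equivalent eigenvector formulas; keep the better-conditioned one.
    v1 = (b, lam - a)
    v2 = (lam - c, b)
    vx, vy = v1 if math.hypot(*v1) >= math.hypot(*v2) else v2
    norm = math.hypot(vx, vy)
    if norm == 0.0:
        # Isotropic neighbourhood: no preferred direction, use plain mean shift.
        return (px + mx, py + my), math.hypot(mx, my)
    vx /= norm
    vy /= norm
    # Project the mean-shift vector onto the estimated normal direction.
    proj = mx * vx + my * vy
    return (px + proj * vx, py + proj * vy), abs(proj)

def scms(point, data, h, tol=1e-9, max_iter=200):
    """Iterate scms_step until the projected step is shorter than tol."""
    for _ in range(max_iter):
        point, step_len = scms_step(point, data, h)
        if step_len < tol:
            break
    return point
```

For points placed on a unit circle, a starting point just outside the circle is pulled radially toward it, because the projection discards the component of the mean-shift vector that runs along the ridge.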


Normal-bundle Bootstrap

Zhang, Ruda, Ghanem, Roger

arXiv.org Machine Learning

Such a phenomenon is summed up in the manifold distribution hypothesis, and can be exploited in probabilistic learning. Here we present normal-bundle bootstrap (NBB), a method that generates new data which preserve the geometric structure of a given data set. Inspired by algorithms for manifold learning and concepts in differential geometry, our method decomposes the underlying probability measure into a marginalized measure on a learned data manifold and conditional measures on the normal spaces. The algorithm estimates the data manifold as a density ridge, and constructs new data by bootstrapping projection vectors and adding them to the ridge. We apply our method to the inference of density ridges and related statistics, and to data augmentation to reduce overfitting.


Manifold unwrapping using density ridges

Myhre, Jonas Nordhaug, Shaker, Matineh, Kaba, Devrim, Jenssen, Robert, Erdogmus, Deniz

arXiv.org Machine Learning

Research on manifold learning within a density ridge estimation framework has shown great potential in recent work for both estimation and de-noising of manifolds, building on the intuitive and well-defined notion of principal curves and surfaces. However, the problem of unwrapping or unfolding manifolds has received relatively little attention within the density ridge approach, despite being an integral part of manifold learning in general. This paper proposes two novel algorithms for unwrapping manifolds based on estimated principal curves and surfaces for one- and multi-dimensional manifolds, respectively. The methods of unwrapping are founded in the realization that both principal curves and principal surfaces will have inherent local maxima of the probability density function. Following this observation, coordinate systems that follow the shape of the manifold can be computed by following the integral curves of the gradient flow of a kernel density estimate on the manifold. Furthermore, since integral curves of the gradient flow of a kernel density estimate are inherently local, we propose to stitch together local coordinate systems using parallel transport along the manifold. We provide numerical experiments on both real and synthetic data that illustrate clear and intuitive unwrapping results comparable to state-of-the-art manifold learning algorithms.


Nonparametric modal regression

Chen, Yen-Chi, Genovese, Christopher R., Tibshirani, Ryan J., Wasserman, Larry

arXiv.org Machine Learning

Modal regression estimates the local modes of the distribution of $Y$ given $X=x$, instead of the mean, as in the usual regression sense, and can hence reveal important structure missed by usual regression methods. We study a simple nonparametric method for modal regression, based on a kernel density estimate (KDE) of the joint distribution of $Y$ and $X$. We derive asymptotic error bounds for this method, and propose techniques for constructing confidence sets and prediction sets. The latter are used to select the smoothing bandwidth of the underlying KDE. The idea behind modal regression is connected to many others, such as mixture regression and density ridge estimation, and we discuss these ties as well.
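The abstract does not spell out how the conditional modes are computed from the joint KDE. One common approach is a partial mean-shift iteration that holds $x$ fixed and updates only $y$; the sketch below illustrates that idea for scalar $X$ and $Y$. The Gaussian kernel, the single shared bandwidth `h`, and the function names `partial_mean_shift` and `conditional_modes` are assumptions made for illustration, not details taken from the paper.

```python
import math

def gaussian(u):
    """Unnormalised Gaussian kernel profile."""
    return math.exp(-0.5 * u * u)

def partial_mean_shift(x, y0, xs, ys, h, tol=1e-8, max_iter=500):
    """Climb the joint KDE in the y direction only, holding x fixed."""
    y = y0
    for _ in range(max_iter):
        weights = [gaussian((x - xi) / h) * gaussian((y - yi) / h)
                   for xi, yi in zip(xs, ys)]
        total = sum(weights)
        if total == 0.0:
            # No sample carries weight at (x, y); stop where we are.
            return y
        y_new = sum(w * yi for w, yi in zip(weights, ys)) / total
        if abs(y_new - y) < tol:
            return y_new
        y = y_new
    return y

def conditional_modes(x, xs, ys, h, merge_tol=1e-3):
    """Start from every observed y and collect the distinct limit points."""
    modes = []
    for y0 in ys:
        m = partial_mean_shift(x, y0, xs, ys, h)
        if all(abs(m - other) > merge_tol for other in modes):
            modes.append(m)
    return sorted(modes)
```

Starting the iteration from every observed response is a simple way to find all local modes at a given `x`. When the data follow two well-separated branches, the call returns one value per branch, where a mean-based regression would return a single value between them.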


Optimal Ridge Detection using Coverage Risk

Chen, Yen-Chi, Genovese, Christopher R., Ho, Shirley, Wasserman, Larry

Neural Information Processing Systems

We introduce the concept of coverage risk as an error measure for density ridge estimation. The coverage risk generalizes the mean integrated square error to set estimation. We propose two risk estimators for the coverage risk and we show that we can select tuning parameters by minimizing the estimated risk. We study the rate of convergence for coverage risk and prove consistency of the risk estimators. We apply our method to three simulated datasets and to cosmology data. In all the examples, the proposed method successfully recovers the underlying density structure.


Optimal Ridge Detection using Coverage Risk

Chen, Yen-Chi, Genovese, Christopher R., Ho, Shirley, Wasserman, Larry

arXiv.org Machine Learning

We introduce the concept of coverage risk as an error measure for density ridge estimation. The coverage risk generalizes the mean integrated square error to set estimation. We propose two risk estimators for the coverage risk and we show that we can select tuning parameters by minimizing the estimated risk. We study the rate of convergence for coverage risk and prove consistency of the risk estimators. We apply our method to three simulated datasets and to cosmology data. In all the examples, the proposed method successfully recovers the underlying density structure.