Analysis of Contour Motions
Liu, Ce, Freeman, William T., Adelson, Edward H.
A reliable motion estimation algorithm must function under a wide range of conditions. One regime, which we consider here, is the case of moving objects with contours but no visible texture. Tracking distinctive features such as corners can disambiguate the motion of contours, but spurious features such as T-junctions can be badly misleading. It is difficult to determine the reliability of motion from local measurements, since a full rank covariance matrix can result from both real and spurious features. We propose a novel approach that avoids these points altogether, and derives global motion estimates by utilizing information from three levels of contour analysis: edgelets, boundary fragments and contours.
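To make the point about local ambiguity concrete, here is a minimal numpy sketch (not the authors' algorithm; the synthetic patches and function names are illustrative assumptions) of the second-moment (structure tensor) test the abstract alludes to: a straight edge yields a rank-1 matrix (the aperture problem), while a corner yields a full-rank matrix; a T-junction produced by occlusion would also yield a full-rank matrix, which is why the rank test alone cannot certify reliable motion.

import numpy as np

def structure_tensor(patch):
    # 2x2 second-moment matrix of image gradients, summed over the patch
    gy, gx = np.gradient(patch.astype(float))
    return np.array([[np.sum(gx * gx), np.sum(gx * gy)],
                     [np.sum(gx * gy), np.sum(gy * gy)]])

# Hypothetical synthetic patches: a straight edge and a corner
edge = np.zeros((15, 15))
edge[7:, :] = 1.0          # horizontal edge only
corner = np.zeros((15, 15))
corner[7:, 7:] = 1.0       # two edges meeting at a corner

print(np.linalg.eigvalsh(structure_tensor(edge)))    # one ~zero eigenvalue: aperture problem
print(np.linalg.eigvalsh(structure_tensor(corner)))  # both positive: full rank
# A T-junction caused by occlusion also gives a full-rank tensor, so this
# local test cannot separate real corners from spurious features.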
Attribute-efficient learning of decision lists and linear threshold functions under unconcentrated distributions
Long, Philip M., Servedio, Rocco
We consider the well-studied problem of learning decision lists using few examples when many irrelevant features are present. We show that smooth boosting algorithms such as MadaBoost can efficiently learn decision lists of length k over n boolean variables using poly(k, log n) many examples provided that the marginal distribution over the relevant variables is "not too concentrated" in an L2-norm sense.
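For readers unfamiliar with the concept class, a decision list of length k is an ordered sequence of k single-literal rules with a default label, where the first matching rule fires. The sketch below (hypothetical names; this is only the semantics of the class, not the paper's boosting algorithm) pins that down. Attribute efficiency means the poly(k, log n) sample bound grows only logarithmically in the total number of variables n.

import numpy as np

def eval_decision_list(rules, default, x):
    # rules: ordered (feature_index, feature_value, output) triples;
    # the first rule whose literal matches fires, else the default label
    for i, v, out in rules:
        if x[i] == v:
            return out
    return default

# Hypothetical decision list of length k = 2 over n = 6 boolean variables:
# "if x3 == 1 predict 1; else if x0 == 0 predict 0; else predict 1"
rules = [(3, 1, 1), (0, 0, 0)]
x = np.array([0, 0, 1, 0, 1, 1])
print(eval_decision_list(rules, default=1, x=x))  # -> 0 (second rule fires)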
Differential Entropic Clustering of Multivariate Gaussians
Davis, Jason V., Dhillon, Inderjit S.
Gaussian data is pervasive and many learning algorithms (e.g., k-means) model their inputs as a single sample drawn from a multivariate Gaussian. However, in many real-life settings, each input object is best described by multiple samples drawn from a multivariate Gaussian. Such data can arise, for example, in a movie review database where each movie is rated by several users, or in time-series domains such as sensor networks. Here, each input can be naturally described by both a mean vector and covariance matrix which parameterize the Gaussian distribution. In this paper, we consider the problem of clustering such input objects, each represented as a multivariate Gaussian. We formulate the problem using an information theoretic approach and draw several interesting theoretical connections to Bregman divergences and also Bregman matrix divergences. We evaluate our method across several domains, including synthetic data, sensor network data, and a statistical debugging application.
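One concrete instance of an information-theoretic distortion between such inputs is the closed-form differential relative entropy (KL divergence) between two multivariate Gaussians. The sketch below is a hedged illustration, not the paper's method: in particular, the centroid update is a plain average of the members' means and covariances, a simplified stand-in for the Bregman centroids derived in the paper.

import numpy as np

def kl_gaussians(mu0, S0, mu1, S1):
    # Closed-form KL( N(mu0, S0) || N(mu1, S1) ) for d-dimensional Gaussians
    d = mu0.shape[0]
    S1_inv = np.linalg.inv(S1)
    diff = mu1 - mu0
    logdet0 = np.linalg.slogdet(S0)[1]
    logdet1 = np.linalg.slogdet(S1)[1]
    return 0.5 * (np.trace(S1_inv @ S0) + diff @ S1_inv @ diff
                  - d + logdet1 - logdet0)

def cluster_gaussians(mus, Ss, k, iters=20, seed=0):
    # k-means-style loop over (mean, covariance) inputs under KL divergence.
    # Simplification: representatives are plain averages, not the paper's
    # Bregman centroids.
    rng = np.random.default_rng(seed)
    n = len(mus)
    idx = rng.choice(n, size=k, replace=False)
    c_mu = [mus[i].copy() for i in idx]
    c_S = [Ss[i].copy() for i in idx]
    assign = [0] * n
    for _ in range(iters):
        assign = [int(np.argmin([kl_gaussians(mus[i], Ss[i], c_mu[j], c_S[j])
                                 for j in range(k)])) for i in range(n)]
        for j in range(k):
            members = [i for i in range(n) if assign[i] == j]
            if members:
                c_mu[j] = np.mean([mus[i] for i in members], axis=0)
                c_S[j] = np.mean([Ss[i] for i in members], axis=0)
    return assign

# Hypothetical toy input: three Gaussian-summarized objects, two clusters
mus = [np.zeros(2), np.array([5.0, 5.0]), np.array([0.2, 0.1])]
Ss = [np.eye(2), np.eye(2), 2.0 * np.eye(2)]
print(cluster_gaussians(mus, Ss, k=2))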
Convergence of Laplacian Eigenmaps
Belkin, Mikhail, Niyogi, Partha
Geometrically based methods for various tasks of machine learning have attracted considerable attention over the last few years. In this paper we show convergence of eigenvectors of the point cloud Laplacian to the eigenfunctions of the Laplace-Beltrami operator on the underlying manifold, thus establishing the first convergence results for a spectral dimensionality reduction algorithm in the manifold setting.
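A minimal sketch of the object whose spectrum is analyzed: build a Gaussian-weighted graph on the point cloud, form the graph Laplacian L = D - W, and embed with its low eigenvectors. Function names and parameter choices below are illustrative assumptions, not the paper's construction details; the convergence result concerns such eigenvectors approaching Laplace-Beltrami eigenfunctions as the sample grows and the bandwidth shrinks appropriately.

import numpy as np

def laplacian_eigenmaps(X, n_components=2, sigma=0.3):
    # Gaussian-weighted graph on the point cloud
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq / (2 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    D = np.diag(W.sum(axis=1))
    L = D - W                       # unnormalized graph Laplacian
    vals, vecs = np.linalg.eigh(L)  # eigenvalues in ascending order
    # skip the trivial constant eigenvector at eigenvalue ~0
    return vecs[:, 1:1 + n_components]

# Hypothetical example: a noisy circle; the first nontrivial eigenvectors
# approximately recover the sine and cosine of the angular coordinate,
# mirroring the Laplace-Beltrami eigenfunctions of the circle.
theta = np.linspace(0, 2 * np.pi, 200, endpoint=False)
X = np.c_[np.cos(theta), np.sin(theta)]
X += 0.01 * np.random.default_rng(0).normal(size=X.shape)
Y = laplacian_eigenmaps(X)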
Fundamental Limitations of Spectral Clustering
Spectral clustering methods are common graph-based approaches to clustering of data. Spectral clustering algorithms typically start from local information encoded in a weighted graph on the data and cluster according to the global eigenvectors of the corresponding (normalized) similarity matrix. One contribution of this paper is to present fundamental limitations of this general local-to-global approach. We show that based only on local information, the normalized cut functional is not a suitable measure for the quality of clustering. Further, even with a suitable similarity measure, we show that the first few eigenvectors of such adjacency matrices cannot successfully cluster datasets that contain structures at different scales of size and density. Based on these findings, a second contribution of this paper is a novel diffusion-based measure to evaluate the coherence of individual clusters. Our measure can be used in conjunction with any bottom-up graph-based clustering method; it is scale-free and can determine coherent clusters at all scales. We present both synthetic examples and real image segmentation problems where various spectral clustering algorithms fail. In contrast, our coherence measure finds the expected clusters at all scales.
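For reference, here is a minimal sketch of the local-to-global pipeline being critiqued (Ng-Jordan-Weiss-style normalized spectral clustering; all names and the fixed bandwidth sigma are illustrative assumptions). The single global sigma is exactly the kind of choice the abstract argues cannot handle structures at different scales.

import numpy as np

def spectral_clustering(X, k, sigma=1.0, iters=50, seed=0):
    # Gaussian similarity graph (the "local information")
    sq = np.sum((X[:, None] - X[None, :]) ** 2, axis=-1)
    W = np.exp(-sq / (2 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    d = W.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    L_sym = np.eye(len(X)) - D_inv_sqrt @ W @ D_inv_sqrt
    _, vecs = np.linalg.eigh(L_sym)
    U = vecs[:, :k]                               # the "global eigenvectors"
    U = U / np.linalg.norm(U, axis=1, keepdims=True)
    # plain k-means on the row-normalized spectral embedding
    rng = np.random.default_rng(seed)
    C = U[rng.choice(len(U), size=k, replace=False)]
    labels = np.zeros(len(U), dtype=int)
    for _ in range(iters):
        labels = np.argmin(((U[:, None] - C[None]) ** 2).sum(-1), axis=1)
        C = np.array([U[labels == j].mean(axis=0) if np.any(labels == j)
                      else C[j] for j in range(k)])
    return labels

# Hypothetical well-separated blobs at a single scale, where this succeeds;
# the paper's point is that no single sigma handles multiscale structure.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.1, (30, 2)), rng.normal(3.0, 0.1, (30, 2))])
print(spectral_clustering(X, k=2, sigma=0.5))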
On the Relation Between Low Density Separation, Spectral Clustering and Graph Cuts
Narayanan, Hariharan, Belkin, Mikhail, Niyogi, Partha
One of the intuitions underlying many graph-based methods for clustering and semi-supervised learning is that class or cluster boundaries pass through areas of low probability density. In this paper we provide some formal analysis of that notion for a probability distribution. We introduce a notion of weighted boundary volume, which measures the length of the class/cluster boundary weighted by the density of the underlying probability distribution. We show that sizes of the cuts of certain commonly used data adjacency graphs converge to this continuous weighted volume of the boundary.
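The discrete quantity in this statement is simply the total weight of graph edges crossing the partition. A small numpy sketch follows (hypothetical names and bandwidth; the paper supplies the precise rescaling under which this quantity converges to the density-weighted boundary volume).

import numpy as np

def graph_cut_size(X, labels, sigma=0.1):
    # total Gaussian edge weight crossing a two-class partition
    sq = np.sum((X[:, None] - X[None, :]) ** 2, axis=-1)
    W = np.exp(-sq / (2 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    cross = labels[:, None] != labels[None, :]
    return W[cross].sum() / 2.0  # each crossing edge counted once

# Hypothetical example: uniform samples on a line, boundary at x = 0
X = np.linspace(-1.0, 1.0, 100)[:, None]
labels = (X[:, 0] > 0).astype(int)
print(graph_cut_size(X, labels))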
PAC-Bayes Bounds for the Risk of the Majority Vote and the Variance of the Gibbs Classifier
Lacasse, Alexandre, Laviolette, François, Marchand, Mario, Germain, Pascal, Usunier, Nicolas
We propose new PAC-Bayes bounds for the risk of the weighted majority vote that depend on the mean and variance of the error of its associated Gibbs classifier. We show that these bounds can be smaller than the risk of the Gibbs classifier and can be arbitrarily close to zero even if the risk of the Gibbs classifier is close to 1/2. We also show that these bounds can be uniformly estimated on the training data for all possible posteriors Q. Furthermore, they can be improved by using a large sample of unlabelled data.
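The role of the variance can be illustrated with Cantelli's one-sided Chebyshev inequality, which is the mechanism behind such variance-based bounds (this sketch is not the paper's PAC-Bayes bound itself, and the names are hypothetical): writing W(x, y) for the Q-weighted fraction of classifiers that err on (x, y), the majority vote errs exactly when W >= 1/2, so P(W >= 1/2) <= Var(W) / (Var(W) + (1/2 - E[W])^2) whenever the Gibbs risk E[W] is below 1/2. If the voters' errors rarely coincide, Var(W) is small and the bound can approach zero even with a sizable Gibbs risk.

import numpy as np

def majority_vote_variance_bound(errors, q):
    # errors: (n_classifiers, n_examples) 0/1 matrix, 1 = classifier errs
    # q: posterior weights over classifiers, summing to 1
    W = q @ errors                 # Q-average error on each example
    gibbs_risk = W.mean()          # mean error of the Gibbs classifier
    variance = W.var()             # its variance across examples
    # Cantelli: P(W >= 1/2) <= Var / (Var + (1/2 - mean)^2) when mean < 1/2
    assert gibbs_risk < 0.5, "bound is only informative below 1/2"
    return variance / (variance + (0.5 - gibbs_risk) ** 2)

# Hypothetical voters whose mistakes never coincide: the Gibbs risk is 1/3,
# yet the majority vote never errs and the bound evaluates to 0
errors = np.array([[1, 0, 0, 1, 0, 0],
                   [0, 1, 0, 0, 1, 0],
                   [0, 0, 1, 0, 0, 1]])
q = np.full(3, 1 / 3)
print(majority_vote_variance_bound(errors, q))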