Goto

Collaborating Authors

 Statistical Learning



Hyperparameter Learning for Graph Based Semi-supervised Learning Algorithms

Neural Information Processing Systems

Semi-supervised learning algorithms have been successfully applied in many applications withscarce labeled data, by utilizing the unlabeled data. One important category is graph based semi-supervised learning algorithms, for which the performance dependsconsiderably on the quality of the graph, or its hyperparameters. In this paper, we deal with the less explored problem of learning the graphs. We propose agraph learning method for the harmonic energy minimization method; this is done by minimizing the leave-one-out prediction error on labeled data points. We use a gradient based method and designed an efficient algorithm which significantly acceleratesthe calculation of the gradient by applying the matrix inversion lemma and using careful pre-computation. Experimental results show that the graph learning method is effective in improving the performance of the classification algorithm.


Simplifying Mixture Models through Function Approximation

Neural Information Processing Systems

Finite mixture model is a powerful tool in many statistical learning problems. In this paper, we propose a general, structure-preserving approach to reduce its model complexity, which can bring significant computational benefits in many applications. The basic idea is to group the original mixture components into compact clusters, and then minimize an upper bound on the approximation error between the original and simplified models.


Doubly Stochastic Normalization for Spectral Clustering

Neural Information Processing Systems

In this paper we focus on the issue of normalization of the affinity matrix in spectral clustering.We show that the difference between N-cuts and Ratio-cuts is in the error measure being used (relative-entropy versus L


Stochastic Relational Models for Discriminative Link Prediction

Neural Information Processing Systems

We introduce a Gaussian process (GP) framework, stochastic relational models (SRM), for learning social, physical, and other relational phenomena where interactions betweenentities are observed. The key idea is to model the stochastic structure of entity relationships (i.e., links) via a tensor interaction of multiple GPs, each defined on one type of entities. These models in fact define a set of nonparametric priors on infinite dimensional tensor matrices, where each element represents a relationship between a tuple of entities. By maximizing the marginalized likelihood,information is exchanged between the participating GPs through the entire relational network, so that the dependency structure of links is messaged to the dependency of entities, reflected by the adapted GP kernels. The framework offers a discriminative approach to link prediction, namely, predicting the existences, strengths,or types of relationships based on the partially observed linkage network as well as the attributes of entities (if given). We discuss properties and variants of SRM and derive an efficient learning algorithm. Very encouraging experimental resultsare achieved on a toy problem and a user-movie preference link prediction task. In the end we discuss extensions of SRM to general relational learning tasks.


A Local Learning Approach for Clustering

Neural Information Processing Systems

We present a local learning approach for clustering. The basic idea is that a good clustering result should have the property that the cluster label of each data point can be well predicted based on its neighboring data and their cluster labels, using currentsupervised learning methods. An optimization problem is formulated such that its solution has the above property. Relaxation and eigen-decomposition are applied to solve this optimization problem. We also briefly investigate the parameter selectionissue and provide a simple parameter selection method for the proposed algorithm. Experimental results are provided to validate the effectiveness ofthe proposed approach.



Online Clustering of Moving Hyperplanes

Neural Information Processing Systems

We propose a recursive algorithm for clustering trajectories lying in multiple moving hyperplanes.Starting from a given or random initial condition, we use normalized gradientdescent to update the coefficients of a time varying polynomial whose degree is the number of hyperplanes and whose derivatives at a trajectory give an estimate of the vector normal to the hyperplane containing that trajectory. As time proceeds, the estimates of the hyperplane normals are shown to track their true values in a stable fashion. The segmentation of the trajectories is then obtained by clustering their associated normal vectors. The final result is a simple recursive algorithm for segmenting a variable number of moving hyperplanes. We test our algorithm on the segmentation of dynamic scenes containing rigid motions anddynamic textures, e.g., a bird floating on water. Our method not only segments the bird motion from the surrounding water motion, but also determines patterns of motion in the scene (e.g., periodic motion) directly from the temporal evolution of the estimated polynomial coefficients. Our experiments also show that our method can deal with appearing and disappearing motions in the scene.


A Complexity-Distortion Approach to Joint Pattern Alignment

Neural Information Processing Systems

Image Congealing (1C) is a nonparametric method for the joint alignment of a collection of images affected by systematic and unwanted deformations.


Scalable Discriminative Learning for Natural Language Parsing and Translation

Neural Information Processing Systems

Parsing and translating natural languages can be viewed as problems of predicting treestructures. For machine learning approaches to these predictions, the diversity and high dimensionality of the structures involved mandate very large training sets. This paper presents a purely discriminative learning method that scales up well to problems of this size. Its accuracy was at least as good as other comparable methods on a standard parsing task. To our knowledge, it is the first purely discriminative learning algorithm for translation with treestructured models.Unlike other popular methods, this method does not require a great deal of feature engineering a priori, because it performs feature selection overa compound feature space as it learns. Experiments demonstrate the method's versatility, accuracy, and efficiency. Relevant software is freely available at http://nlp.cs.nyu.edu/parser and http://nlp.cs.nyu.edu/GenPar.