Relational Learning with Gaussian Processes

Neural Information Processing Systems

Correlation between instances is often modelled via a kernel function using input attributes of the instances. Relational knowledge can further reveal additional pairwise correlations between variables of interest. In this paper, we develop a class of models that incorporates both reciprocal relational information and input attributes using Gaussian process techniques. This approach provides a novel nonparametric Bayesian framework with a data-dependent covariance function for supervised learning tasks. We also apply this framework to semi-supervised learning. Experimental results on several real-world data sets verify the usefulness of this algorithm.
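As a concrete, simplified illustration of such a data-dependent covariance, the sketch below combines an RBF kernel on input attributes with a regularized-Laplacian kernel on a relational graph via an element-wise (Schur) product, which preserves positive semi-definiteness. The particular combination, data, and parameter values are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

def rbf_kernel(X, lengthscale=1.0):
    # Attribute-based covariance: k(x, x') = exp(-||x - x'||^2 / (2 l^2)).
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * lengthscale ** 2))

def graph_kernel(A, beta=1.0):
    # Relational covariance from a regularized graph Laplacian:
    # K_G = (I + beta * L)^{-1}, where L = D - A.
    L = np.diag(A.sum(1)) - A
    return np.linalg.inv(np.eye(len(A)) + beta * L)

# Toy data: 4 instances with 1-d attributes and pairwise links.
X = np.array([[0.0], [0.1], [2.0], [2.1]])
A = np.array([[0, 1, 0, 0],
              [1, 0, 0, 0],
              [0, 0, 0, 1],
              [0, 0, 1, 0]], float)

# Element-wise product keeps both sources of correlation and stays PSD.
K = rbf_kernel(X) * graph_kernel(A)

# Standard GP regression posterior mean at the training points,
# given noisy labels y and noise variance 0.1.
y = np.array([1.0, 1.0, -1.0, -1.0])
mean = K @ np.linalg.solve(K + 0.1 * np.eye(4), y)
```

Here the relational graph damps correlation between unlinked instances even when their attributes are similar; the rest of the GP machinery is unchanged.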



Learnability and the doubling dimension

Neural Information Processing Systems

We prove bounds on the sample complexity of PAC learning in terms of the doubling dimension of the metric induced on the classifiers by their probability of disagreement. These bounds imply known bounds on the sample complexity of learning halfspaces with respect to the uniform distribution that are optimal up to a constant factor.





A Local Learning Approach for Clustering

Neural Information Processing Systems

We present a local learning approach for clustering. The basic idea is that a good clustering result should have the property that the cluster label of each data point can be well predicted from its neighboring data and their cluster labels, using current supervised learning methods. An optimization problem is formulated such that its solution has the above property. Relaxation and eigen-decomposition are applied to solve this optimization problem. We also briefly investigate the parameter selection issue and provide a simple parameter selection method for the proposed algorithm. Experimental results are provided to validate the effectiveness of the proposed approach.
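The pipeline described above can be sketched as follows, with a uniform k-nearest-neighbor averager standing in for the paper's local supervised predictor. The relaxed objective then reduces to the smallest eigenvectors of (I - P)^T (I - P). Everything here (the predictor, the two-blob data, the final assignment step) is a simplification for illustration, not the paper's exact formulation.

```python
import numpy as np

def local_predictor(X, n_neighbors=5):
    # P[i, j]: weight with which neighbor j predicts point i's label,
    # here via uniform nearest-neighbor averaging.
    n = len(X)
    d = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
    np.fill_diagonal(d, np.inf)          # a point may not predict itself
    P = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(d[i])[:n_neighbors]
        P[i, nbrs] = 1.0 / n_neighbors
    return P

def local_learning_clusters(X, n_neighbors=5):
    # Relaxed objective: minimize ||f - P f||^2, solved by the smallest
    # eigenvectors of M = (I - P)^T (I - P).
    P = local_predictor(X, n_neighbors)
    I = np.eye(len(X))
    M = (I - P).T @ (I - P)
    _, vecs = np.linalg.eigh(M)          # eigenvalues in ascending order
    emb = vecs[:, :2]                    # embedding from two smallest eigenvectors
    # For two clusters: points whose embedding matches point 0's form one
    # cluster (a stand-in for the k-means step usually run on embeddings).
    return (np.linalg.norm(emb - emb[0], axis=1) > 1e-3).astype(int)

# Two well-separated blobs of 10 points each.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.1, (10, 2)), rng.normal(5, 0.1, (10, 2))])
labels = local_learning_clusters(X)
```

With well-separated blobs the predictor matrix P is block-structured, so the indicator vectors of the blobs lie (approximately) in the null space of M and the embedding separates the clusters.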


Bayesian Ensemble Learning

Neural Information Processing Systems

We develop a Bayesian "sum-of-trees" model, named BART, where each tree is constrained by a prior to be a weak learner. Fitting and inference are accomplished via an iterative backfitting MCMC algorithm. This model is motivated by ensemble methods in general, and boosting algorithms in particular. Like boosting, each weak learner (i.e., each weak tree) contributes a small amount to the overall model. However, our procedure is defined by a statistical model: a prior and a likelihood, while boosting is defined by an algorithm. This model-based approach enables a full and accurate assessment of uncertainty in model predictions, while remaining highly competitive in terms of predictive accuracy.
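BART itself samples trees with a backfitting MCMC and a regularizing prior; the toy sketch below keeps only the additive sum-of-trees structure, greedily backfitting shrunken depth-1 stumps, to show how many weak trees each contribute a small amount. The shrinkage factor, stump learner, and data are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

def fit_stump(x, r):
    # Depth-1 tree (stump): one split minimizing squared error on residual r.
    best = (np.inf, None)
    for s in np.unique(x):
        left, right = r[x <= s], r[x > s]
        if len(left) == 0 or len(right) == 0:
            continue
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if sse < best[0]:
            best = (sse, (s, left.mean(), right.mean()))
    return best[1]

def predict_stump(stump, x):
    s, left_val, right_val = stump
    return np.where(x <= s, left_val, right_val)

def backfit_sum_of_stumps(x, y, n_trees=10, n_sweeps=5, shrink=0.3):
    # Sum-of-trees model f(x) = sum_m g_m(x). Each sweep refits tree m
    # against the residual left by all the other trees (backfitting).
    # `shrink` keeps each stump a weak learner, echoing BART's prior
    # (though BART enforces weakness via a prior, not a fixed factor).
    stumps = [None] * n_trees
    preds = np.zeros((n_trees, len(x)))
    for _ in range(n_sweeps):
        for m in range(n_trees):
            resid = y - preds.sum(0) + preds[m]
            stumps[m] = fit_stump(x, resid)
            preds[m] = shrink * predict_stump(stumps[m], x)
    return stumps, preds.sum(0)

x = np.linspace(0.0, 1.0, 50)
y = np.sin(2 * np.pi * x)
_, fhat = backfit_sum_of_stumps(x, y)
```

Because each leaf value is the mean of the current residual, every stump update provably lowers the training error, while the shrinkage keeps any single tree from dominating the fit.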


Modeling Human Motion Using Binary Latent Variables

Neural Information Processing Systems

The latent and visible variables at each time step receive directed connections from the visible variables at the previous few time steps. Such an architecture makes online inference efficient and allows us to use a simple approximate learning procedure.
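The efficiency claim follows because the directed connections from past visible variables enter only as additional biases on the current units, so the binary latents remain conditionally independent given the data and a single sigmoid evaluation suffices. The sketch below illustrates this inference step with hypothetical dimensions and random weights; the weight names are illustrative, not the paper's.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative dimensions: n_v visible units, n_h binary latent units,
# and directed connections from the previous `order` frames.
n_v, n_h, order = 4, 6, 3
rng = np.random.default_rng(1)
W = rng.normal(0, 0.1, (n_v, n_h))          # undirected visible-latent weights
B = rng.normal(0, 0.1, (order * n_v, n_h))  # directed: past visibles -> latents

def infer_latents(v_t, v_past):
    # Past visibles contribute only a dynamic bias, so the latents are
    # conditionally independent given v_t and the history: one sigmoid
    # per unit, which is what makes online inference cheap.
    dynamic_bias = v_past @ B
    return sigmoid(v_t @ W + dynamic_bias)

v_past = rng.normal(size=order * n_v)       # concatenated history frames
v_t = rng.normal(size=n_v)                  # current frame
h_prob = infer_latents(v_t, v_past)         # activation probabilities
```

Sampling the binary latents then amounts to thresholding `h_prob` against uniform noise, frame by frame, with no iterative inference loop.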


Recursive Attribute Factoring

Neural Information Processing Systems

Clustering, or factoring, of a document collection attempts to "explain" each observed document in terms of one or a small number of inferred prototypes. Prior work demonstrated that when links exist between documents in the corpus (as is the case with a collection of web pages or scientific papers), building a joint model of document contents and connections produces a better model than one built from contents or connections alone. Many problems arise when trying to apply these joint models to corpora at the scale of the World Wide Web, however; one of these is that the sheer overhead of representing a feature space on the order of billions of dimensions becomes impractical. We address this problem with a simple representational shift inspired by probabilistic relational models: instead of representing document linkage in terms of the identities of linking documents, we represent it by the explicit and inferred attributes of the linking documents. Several surprising results come with this shift: in addition to being computationally more tractable, the new model produces factors that more cleanly decompose the document collection. We discuss several variations on this model and show how some can be seen as exact generalizations of the PageRank algorithm.
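Since some variants of the model are exact generalizations of PageRank, the standard power-iteration PageRank is a useful reference point. The sketch below is the classic algorithm (random surfer with damping), not the paper's generalization; the tiny link matrix is made up for illustration.

```python
import numpy as np

def pagerank(A, damping=0.85, n_iter=100):
    # Power iteration on the PageRank Markov chain: with probability
    # `damping` follow a random outlink, otherwise jump to a uniformly
    # random page. A[i, j] = 1 means page i links to page j.
    n = len(A)
    out = A.sum(1, keepdims=True)
    # Row-normalize; dangling pages (no outlinks) link uniformly everywhere.
    P = np.where(out > 0, A / np.maximum(out, 1), 1.0 / n)
    r = np.full(n, 1.0 / n)
    for _ in range(n_iter):
        r = damping * (r @ P) + (1 - damping) / n
    return r

# Tiny web: page 2 is linked by both other pages, so it ranks highest.
A = np.array([[0, 0, 1],
              [1, 0, 1],
              [0, 0, 0]], float)
r = pagerank(A)
```

The iteration preserves the total mass of `r`, so the result is a probability distribution over pages; generalizing the transition weights from link identities to link attributes is the shift the paper describes.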