
Collaborating Authors

 Williams, Christopher K. I.


On a Connection between Kernel PCA and Metric Multidimensional Scaling

Neural Information Processing Systems

In this paper we show that the kernel PCA algorithm of Schölkopf et al. (1998) can be interpreted as a form of metric multidimensional scaling (MDS) when the kernel function k(x, y) is isotropic, i.e. it depends only on ||x − y||. This leads to a metric MDS algorithm where the desired configuration of points is found via the solution of an eigenproblem rather than through the iterative optimization of the stress objective function. The question of kernel choice is also discussed.
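As a rough illustration of the eigenproblem view, here is a minimal kernel PCA sketch with an isotropic (RBF) kernel; the kernel form, the width gamma, and the toy data are assumptions for illustration, not the paper's specific choices.

```python
import numpy as np

def kernel_pca(X, n_components=2, gamma=1.0):
    """Kernel PCA with an isotropic (RBF) kernel, solved as an eigenproblem.

    The RBF form and gamma are illustrative; any kernel depending only on
    ||x - y|| fits the isotropic setting discussed above.
    """
    # Pairwise squared distances and the isotropic kernel matrix
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K = np.exp(-gamma * sq_dists)

    # Centre the kernel matrix in feature space
    n = K.shape[0]
    one_n = np.ones((n, n)) / n
    Kc = K - one_n @ K - K @ one_n + one_n @ K @ one_n

    # Eigendecomposition: the leading eigenvectors give the embedding
    eigvals, eigvecs = np.linalg.eigh(Kc)
    idx = np.argsort(eigvals)[::-1][:n_components]
    return eigvecs[:, idx] * np.sqrt(np.maximum(eigvals[idx], 0.0))

# Example: embed 20 random 5-dimensional points into 2 dimensions
Z = kernel_pca(np.random.randn(20, 5), n_components=2, gamma=0.5)
```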




On a Connection between Kernel PCA and Metric Multidimensional Scaling

Neural Information Processing Systems

This leads to a metric MDS algorithm where the desired configuration of points is found via the solution of an eigenproblem rather than through the iterative optimization of the stress objective function. The question of kernel choice is also discussed. Suppose we are given n objects, and for each pair (i, j) we have a measurement of the "dissimilarity" δ_ij between the two objects. In multidimensional scaling (MDS) the aim is to place n points in a low-dimensional space (usually Euclidean) so that the interpoint distances d_ij have a particular relationship to the original dissimilarities. In classical scaling we would like the interpoint distances to be equal to the dissimilarities. For example, classical scaling can be used to reconstruct a map of the locations of some cities given the distances between them.
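For classical scaling specifically, a small sketch under the usual double-centring construction; the toy distance matrix below is hypothetical data, not from the paper.

```python
import numpy as np

def classical_scaling(D, n_components=2):
    """Classical (metric) MDS: recover coordinates from pairwise distances.

    D is an n x n matrix of dissimilarities; the embedding comes from an
    eigenproblem rather than iterative optimization of a stress function.
    """
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n          # centring matrix
    B = -0.5 * J @ (D ** 2) @ J                  # double-centred Gram matrix
    eigvals, eigvecs = np.linalg.eigh(B)
    idx = np.argsort(eigvals)[::-1][:n_components]
    return eigvecs[:, idx] * np.sqrt(np.maximum(eigvals[idx], 0.0))

# Hypothetical "city" distances: three points on a line at 0, 3 and 5
D = np.array([[0.0, 3.0, 5.0],
              [3.0, 0.0, 2.0],
              [5.0, 2.0, 0.0]])
coords = classical_scaling(D, n_components=1)   # recovers the line up to shift/flip
```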


A MCMC Approach to Hierarchical Mixture Modelling

Neural Information Processing Systems

There are many hierarchical clustering algorithms available, but these lack a firm statistical basis. Here we set up a hierarchical probabilistic mixture model, where data is generated in a hierarchical tree-structured manner. Markov chain Monte Carlo (MCMC) methods are demonstrated which can be used to sample from the posterior distribution over trees containing variable numbers of hidden units.
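To make the tree-structured generative idea concrete, the sketch below draws data from a simple hierarchical mixture in which each node's mean diffuses from its parent's mean; the branching factor, depth, and Gaussian form are assumptions for illustration, and the MCMC sampling over tree structures is not shown.

```python
import numpy as np

def generate_tree_data(depth=3, branching=2, n_per_leaf=10,
                       sigma_node=1.0, sigma_obs=0.2, dim=2, rng=None):
    """Generate data from a simple hierarchical, tree-structured mixture.

    Each node's mean is a Gaussian perturbation of its parent's mean, and
    observations are drawn around the leaf means. Illustrative only.
    """
    rng = np.random.default_rng() if rng is None else rng
    data, labels = [], []

    def recurse(mean, level, label):
        if level == depth:                        # leaf: emit observations
            data.append(mean + sigma_obs * rng.standard_normal((n_per_leaf, dim)))
            labels.extend([label] * n_per_leaf)
            return
        for b in range(branching):                # children diffuse from the parent mean
            child_mean = mean + sigma_node * rng.standard_normal(dim) / (level + 1)
            recurse(child_mean, level + 1, label * branching + b)

    recurse(np.zeros(dim), 0, 0)
    return np.vstack(data), np.array(labels)

X, y = generate_tree_data()
```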


Discovering Hidden Features with Gaussian Processes Regression

Neural Information Processing Systems

In Gaussian process regression the covariance function typically contains a symmetric positive semi-definite matrix W of (inverse squared) length-scale parameters in its exponent. W is often taken to be diagonal, but if we allow W to be a general positive definite matrix which can be tuned on the basis of training data, then an eigen-analysis of W shows that we are effectively creating hidden features, where the dimensionality of the hidden-feature space is determined by the data. We demonstrate the superiority of predictions using the general matrix over those based on a diagonal matrix on two test problems.
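A minimal sketch of this idea, assuming a squared-exponential covariance with a full matrix W in the exponent; the particular W, threshold, and data below are illustrative.

```python
import numpy as np

def se_covariance(X1, X2, W, v0=1.0):
    """Squared-exponential covariance with a general positive definite W:
    k(x, x') = v0 * exp(-0.5 * (x - x')^T W (x - x'))."""
    diff = X1[:, None, :] - X2[None, :, :]
    quad = np.einsum('ijk,kl,ijl->ij', diff, W, diff)
    return v0 * np.exp(-0.5 * quad)

# Illustrative full (non-diagonal) W coupling the two input dimensions
W = np.array([[2.0, 1.5],
              [1.5, 2.0]])

X = np.random.randn(5, 2)
K = se_covariance(X, X, W)              # 5 x 5 covariance matrix

# Eigen-analysis of W: eigenvectors give the "hidden feature" directions,
# eigenvalues their inverse squared length-scales. Directions with tiny
# eigenvalues are effectively irrelevant, so the tuned W determines the
# dimensionality of the hidden-feature space.
eigvals, eigvecs = np.linalg.eigh(W)
features = eigvecs[:, eigvals > 1e-3]
print("hidden feature directions:\n", features, "\nscales:", eigvals)
```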


Finite-Dimensional Approximation of Gaussian Processes

Neural Information Processing Systems

Gaussian process (GP) prediction suffers from O(n³) scaling with the data set size n. By using a finite-dimensional basis to approximate the GP predictor, the computational complexity can be reduced. We derive optimal finite-dimensional predictors under a number of assumptions, and show the superiority of these predictors over the Projected Bayes Regression method (which is asymptotically optimal). We also show how to calculate the minimal model size for a given n. The calculations are backed up by numerical experiments.
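As one concrete example of a finite-dimensional basis, the sketch below uses a subset-of-regressors style predictor built from m inducing points; this illustrates the general idea of reducing the O(n³) cost to roughly O(nm²), not the specific optimal predictors derived in the paper.

```python
import numpy as np

def rbf(X1, X2, gamma=1.0):
    d2 = np.sum((X1[:, None, :] - X2[None, :, :]) ** 2, axis=-1)
    return np.exp(-gamma * d2)

def finite_basis_gp_predict(X, y, X_star, Z, noise=0.1, gamma=1.0):
    """Subset-of-regressors style predictor: represent the GP with the m
    basis functions k(., z_j) centred on inducing points Z. Illustrative,
    not the paper's optimal finite-dimensional predictor."""
    Knm = rbf(X, Z, gamma)          # n x m
    Kmm = rbf(Z, Z, gamma)          # m x m
    Ksm = rbf(X_star, Z, gamma)     # n* x m
    A = Knm.T @ Knm + noise ** 2 * Kmm
    w = np.linalg.solve(A, Knm.T @ y)
    return Ksm @ w

# Example: n = 200 training points, m = 15 basis centres
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, (200, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(200)
Z = np.linspace(-3, 3, 15)[:, None]
pred = finite_basis_gp_predict(X, y, np.array([[0.5]]), Z)
```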


DTs: Dynamic Trees

Neural Information Processing Systems

A dynamic tree model specifies a prior over a large number of trees, each one of which is a tree-structured belief net (TSBN). Our aim is to retain the advantages of tree-structured belief networks, namely the hierarchical structure of the model and (in part) the efficient inference algorithms, while avoiding the "blocky" artifacts that derive from a single, fixed TSBN structure. One use for DTs is as prior models over labellings for image segmentation problems. Section 2 of the paper gives the theory of DTs, and experiments are described in Section 3. There are two essential components that make up a dynamic tree network: (i) the tree architecture and (ii) the nodes and conditional probability tables (CPTs) in the given tree. We consider the architecture question first.
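To illustrate the architecture component, here is a small sketch of one possible prior over tree structures in which each node picks a parent in the layer above; the distance-based affinity and the fixed layer sizes are assumptions for illustration, not the paper's actual architecture prior or CPTs.

```python
import numpy as np

def sample_tree_architecture(layer_sizes, decay=10.0, rng=None):
    """Sample one tree structure from a simple prior over architectures.

    Each node in layer l picks a parent in layer l-1 with probability
    proportional to exp(-decay * horizontal_distance^2). The affinity form
    and the fixed layer sizes are illustrative assumptions.
    """
    rng = np.random.default_rng() if rng is None else rng
    positions = [np.linspace(0.0, 1.0, n) for n in layer_sizes]
    parents = []                      # parents[l-1][i] = parent index of node i in layer l
    for l in range(1, len(layer_sizes)):
        dist2 = (positions[l][:, None] - positions[l - 1][None, :]) ** 2
        probs = np.exp(-decay * dist2)
        probs /= probs.sum(axis=1, keepdims=True)
        parents.append(np.array([rng.choice(len(positions[l - 1]), p=p) for p in probs]))
    return parents

# Example: a quadtree-like layer structure 1 -> 4 -> 16
tree = sample_tree_architecture([1, 4, 16])
```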

