Goto

Collaborating Authors

 Country


Products of Gaussians

Neural Information Processing Systems

Recently Hinton (1999) has introduced the Products of Experts (PoE) model in which several individual probabilistic models for data are combined to provide an overall model of the data. Below we consider PoE models in which each expert is a Gaussian. Although the product of Gaussians is also a Gaussian, if each Gaussian has a simple structure the product can have a richer structure. We examine (1) Products of Gaussian pancakes which give rise to probabilistic Minor Components Analysis, (2) products of I-factor PPCA models and (3) a products of experts construction for an AR(l) process. Recently Hinton (1999) has introduced the Products of Experts (PoE) model in which several individual probabilistic models for data are combined to provide an overall model of the data. In this paper we consider PoE models in which each expert is a Gaussian. It is easy to see that in this case the product model will also be Gaussian. However, if each Gaussian has a simple structure, the product can have a richer structure. Using Gaussian experts is attractive as it permits a thorough analysis of the product architecture, which can be difficult with other models, e.g.


Learning Lateral Interactions for Feature Binding and Sensory Segmentation

Neural Information Processing Systems

We present a new approach to the supervised learning of lateral interactions for the competitive layer model (CLM) dynamic feature binding architecture. The method is based on consistency conditions, which were recently shown to characterize the attractor states of this linear threshold recurrent network. For a given set of training examples the learning problem is formulated as a convex quadratic optimization problem in the lateral interaction weights. An efficient dimension reduction of the learning problem can be achieved by using a linear superposition of basis interactions. We show the successful application of the method to a medical image segmentation problem of fluorescence microscope cell images.


Multi Dimensional ICA to Separate Correlated Sources

Neural Information Processing Systems

There are two linear transformations to be considered, one operating inside the channels (0) and one operating between the different channels (W). The two transformations are estimated in two adjacent leA steps. There are mainly two advantages, that can be taken from the first transformation: (i) By arranging independence among the columns of the transformed patches, the average transinformation between different channels is decreased.


K-Local Hyperplane and Convex Distance Nearest Neighbor Algorithms

Neural Information Processing Systems

Guided by an initial idea of building a complex (non linear) decision surface with maximal local margin in input space, we give a possible geometrical intuition as to why K-Nearest Neighbor (KNN) algorithms often perform more poorly than SVMs on classification tasks. We then propose modified K-Nearest Neighbor algorithms to overcome the perceived problem. The approach is similar in spirit to Tangent Distance, but with invariances inferred from the local neighborhood rather than prior knowledge. Experimental results on real world classification tasks suggest that the modified KNN algorithms often give a dramatic improvement over standard KNN and perform as well or better than SVMs.



Learning Discriminative Feature Transforms to Low Dimensions in Low Dimentions

Neural Information Processing Systems

The marriage of Renyi entropy with Parzen density estimation has been shown to be a viable tool in learning discriminative feature transforms. However, it suffers from computational complexity proportional to the square of the number of samples in the training data. This sets a practical limit to using large databases. We suggest immediate divorce of the two methods and remarriage of Renyi entropy with a semi-parametric density estimation method, such as a Gaussian Mixture Models (GMM). This allows all of the computation to take place in the low dimensional target space, and it reduces computational complexity proportional to square of the number of components in the mixtures. Furthermore, a convenient extension to Hidden Markov Models as commonly used in speech recognition becomes possible.


Risk Sensitive Particle Filters

Neural Information Processing Systems

We propose a new particle filter that incorporates a model of costs when generating particles. The approach is motivated by the observation that the costs of accidentally not tracking hypotheses might be significant in some areas of state space, and next to irrelevant in others. By incorporating a cost model into particle filtering, states that are more critical to the system performance are more likely to be tracked. Automatic calculation of the cost model is implemented using an MDP value function calculation that estimates the value of tracking a particular state. Experiments in two mobile robot domains illustrate the appropriateness of the approach.



Bayesian time series classification

Neural Information Processing Systems

This paper proposes an approach to classification of adjacent segments of a time series as being either of classes. We use a hierarchical model that consists of a feature extraction stage and a generative classifier which is built on top of these features. Such two stage approaches are often used in signal and image processing. The novel part of our work is that we link these stages probabilistically by using a latent feature space. To use one joint model is a Bayesian requirement, which has the advantage to fuse information according to its certainty.


Agglomerative Multivariate Information Bottleneck

Neural Information Processing Systems

The information bottleneck method is an unsupervised model independent data organization technique. Given a joint distribution peA, B), this method constructs a new variable T that extracts partitions, or clusters, over the values of A that are informative about B. In a recent paper, we introduced a general principled framework for multivariate extensions of the information bottleneck method that allows us to consider multiple systems of data partitions that are interrelated. In this paper, we present a new family of simple agglomerative algorithms to construct such systems of interrelated clusters. We analyze the behavior of these algorithms and apply them to several real-life datasets.