Collaborating Authors

 Zemel, Richard S.


Learning Hybrid Models for Image Annotation with Partially Labeled Data

Neural Information Processing Systems

Extensive labeled data for image annotation systems, which learn to assign class labels to image regions, is difficult to obtain. We explore a hybrid model framework for utilizing partially labeled data that integrates a generative topic model for image appearance with discriminative label prediction. We propose three alternative formulations for imposing a spatial smoothness prior on the image labels. Tests of the new models and some baseline approaches on two real image datasets demonstrate the effectiveness of incorporating the latent structure.


Generative versus discriminative training of RBMs for classification of fMRI images

Neural Information Processing Systems

Neuroimaging datasets often have a very large number of voxels and a very small number of training cases, which means that overfitting of models for this data can become a very serious problem. Working with a set of fMRI images from a study on stroke recovery, we consider a classification task for which logistic regression performs poorly, even when L1- or L2-regularized. We show that much better discrimination can be achieved by fitting a generative model to each separate condition and then seeing which model is most likely to have generated the data. We compare discriminative training of exactly the same set of models, and we also consider convex blends of generative and discriminative training.
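The per-condition generative classification idea can be sketched as follows. This is a minimal stand-in that uses a diagonal Gaussian per class rather than the RBMs of the paper; the class `PerClassGaussian` and all names here are ours, for illustration only.

```python
import numpy as np

# Hypothetical sketch: fit one generative model per condition (a diagonal
# Gaussian here, standing in for the paper's per-condition RBMs), then
# classify a test case by which model most likely generated it.
class PerClassGaussian:
    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.params_ = {}
        for c in self.classes_:
            Xc = X[y == c]
            mu = Xc.mean(axis=0)
            var = Xc.var(axis=0) + 1e-6          # variance floor for stability
            log_prior = np.log(len(Xc) / len(X))
            self.params_[c] = (mu, var, log_prior)
        return self

    def log_joint(self, X, c):
        mu, var, log_prior = self.params_[c]
        # Diagonal-Gaussian log density, summed over voxels/features.
        ll = -0.5 * np.sum((X - mu) ** 2 / var + np.log(2 * np.pi * var), axis=1)
        return ll + log_prior

    def predict(self, X):
        scores = np.column_stack([self.log_joint(X, c) for c in self.classes_])
        return self.classes_[np.argmax(scores, axis=1)]

rng = np.random.default_rng(0)
X0 = rng.normal(0.0, 1.0, size=(50, 20))   # synthetic "condition 0" cases
X1 = rng.normal(1.5, 1.0, size=(50, 20))   # synthetic "condition 1" cases
X = np.vstack([X0, X1])
y = np.array([0] * 50 + [1] * 50)
clf = PerClassGaussian().fit(X, y)
acc = (clf.predict(X) == y).mean()
```

With few training cases and many dimensions, this generative route has far fewer effective degrees of freedom per class than a discriminatively trained weight vector, which is the intuition the abstract appeals to.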


Characterizing response behavior in multisensory perception with conflicting cues

Neural Information Processing Systems

We explore a recently proposed mixture model approach to understanding interactions between conflicting sensory cues. Alternative model formulations, differing in their sensory noise models and inference methods, are compared based on their fit to experimental data. Heavy-tailed sensory likelihoods yield a better description of the subjects' response behavior than standard Gaussian noise models. We study the underlying cause for this result, and then present several testable predictions of these models.



Probabilistic Computation in Spiking Populations

Neural Information Processing Systems

As animals interact with their environments, they must constantly update estimates about their states. Bayesian models combine prior probabilities, a dynamical model and sensory evidence to update estimates optimally. These models are consistent with the results of many diverse psychophysical studies. However, little is known about the neural representation and manipulation of such Bayesian information, particularly in populations of spiking neurons. We consider this issue, suggesting a model based on standard neural architecture and activations. We illustrate the approach on a simple random walk example, and apply it to a sensorimotor integration task that provides a particularly compelling example of dynamic probabilistic computation. Bayesian models have been used to explain a gamut of experimental results in tasks which require estimates to be derived from multiple sensory cues.
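The prior/dynamics/evidence update on a random walk can be sketched with an ordinary grid-based Bayesian filter. This is not the paper's spiking-population implementation; the grid size, noise levels, and all variable names below are our own illustrative assumptions.

```python
import numpy as np

# Minimal sketch of dynamic Bayesian estimation on a 1-D random walk:
# predict by diffusing the belief with the dynamical model, then update
# by multiplying in the sensory likelihood and renormalizing.
grid = np.linspace(-5, 5, 201)          # discretized state space
belief = np.exp(-0.5 * grid ** 2)       # Gaussian prior over the state
belief /= belief.sum()

def gauss(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2)

rng = np.random.default_rng(1)
state = 0.0
for _ in range(30):
    state += rng.normal(0.0, 0.3)            # true random-walk dynamics
    obs = state + rng.normal(0.0, 0.5)       # noisy sensory evidence
    # Predict: convolve the belief with the transition kernel.
    kernel = gauss(grid, 0.0, 0.3)
    kernel /= kernel.sum()
    belief = np.convolve(belief, kernel, mode="same")
    # Update: Bayes' rule with the sensory likelihood.
    belief *= gauss(grid, obs, 0.5)
    belief /= belief.sum()

estimate = grid[np.argmax(belief)]           # MAP estimate of the state
```

The question the paper addresses is how a population of spiking neurons could represent `belief` and carry out the convolution and multiplication steps above.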


Proximity Graphs for Clustering and Manifold Learning

Neural Information Processing Systems

Many machine learning algorithms for clustering or dimensionality reduction take as input a cloud of points in Euclidean space, and construct a graph with the input data points as vertices. This graph is then partitioned (clustering) or used to redefine metric information (dimensionality reduction). There has been much recent work on new methods for graph-based clustering and dimensionality reduction, but not much on constructing the graph itself. Graphs typically used include the fully connected graph, a local fixed-grid graph (for image segmentation) or a nearest-neighbor graph. We suggest that the graph should adapt locally to the structure of the data. This can be achieved by a graph ensemble that combines multiple minimum spanning trees, each fit to a perturbed version of the data set. We show that such a graph ensemble usually produces a better representation of the data manifold than standard methods; and that it provides robustness to a subsequent clustering or dimensionality reduction algorithm based on the graph.
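The perturbed-MST ensemble can be sketched in a few lines: build a minimum spanning tree on several jittered copies of the data and take the union of their edges as the proximity graph. The function name, noise level, and tree count below are our own assumptions, not parameters from the paper.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.sparse.csgraph import minimum_spanning_tree

# Illustrative sketch: union of minimum spanning trees, each fit to a
# perturbed version of the data set, as an adaptive proximity graph.
def mst_ensemble_graph(X, n_trees=10, noise=0.05, seed=0):
    rng = np.random.default_rng(seed)
    n = len(X)
    adjacency = np.zeros((n, n), dtype=bool)
    for _ in range(n_trees):
        Xp = X + rng.normal(0.0, noise, size=X.shape)  # perturbed copy
        dists = squareform(pdist(Xp))
        mst = minimum_spanning_tree(dists).toarray()
        edges = mst > 0
        adjacency |= edges | edges.T   # symmetrize and accumulate edges
    return adjacency

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 2))
G = mst_ensemble_graph(X)
```

Because each tree is forced to span all points, the union stays sparse yet connected, which is what makes it a robust input for subsequent graph-based clustering or embedding.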


Multiple Cause Vector Quantization

Neural Information Processing Systems

We propose a model that can learn parts-based representations of high-dimensional data. Our key assumption is that the dimensions of the data can be separated into several disjoint subsets, or factors, which take on values independently of each other. We assume each factor has a small number of discrete states, and model it using a vector quantizer. The selected states of each factor represent the multiple causes of the input. Given a set of training examples, our model learns the association of data dimensions with factors, as well as the states of each VQ. Inference and learning are carried out efficiently via variational algorithms.
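A toy version of the multiple-cause VQ encoding can be sketched with the dimension-to-factor assignment fixed by hand: each disjoint subset of dimensions gets its own small vector quantizer (plain k-means here), and an input is encoded by the selected state of every factor. The paper learns the dimension-to-factor association and uses variational inference; this sketch assumes the assignment is given, and all names are ours.

```python
import numpy as np

# Toy sketch: one vector quantizer (k-means codebook) per disjoint factor.
def kmeans(X, k, iters=20, seed=0):
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centers) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers

def encode(x, factors, codebooks):
    # Selected state per factor = nearest codebook vector on its dimensions.
    return [int(np.argmin(((x[d] - cb) ** 2).sum(-1)))
            for d, cb in zip(factors, codebooks)]

rng = np.random.default_rng(1)
factors = [np.arange(0, 3), np.arange(3, 6)]   # two disjoint dimension sets
X = rng.normal(size=(200, 6))
codebooks = [kmeans(X[:, d], k=4) for d in factors]
states = encode(X[0], factors, codebooks)      # one discrete state per factor
```

The list `states` plays the role of the "multiple causes": a small tuple of discrete indices that jointly reconstruct the high-dimensional input.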


Self Supervised Boosting

Neural Information Processing Systems

Boosting algorithms and successful applications thereof abound for classification and regression learning problems, but not for unsupervised learning. We propose a sequential approach to adding features to a random field model by training them to improve classification performance between the data and an equal-sized sample of "negative examples" generated from the model's current estimate of the data density.



A Gradient-Based Boosting Algorithm for Regression Problems

Neural Information Processing Systems

Adaptive boosting methods are simple modular algorithms that operate as follows. Let g: X → Y be the function to be learned, where the label set Y is finite, typically binary-valued. The algorithm uses a learning procedure, which has access to n training examples, {(x1, y1), ..., (xn, yn)}, drawn randomly from X × Y according to distribution D; it outputs a hypothesis h: X → Y.
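The gradient view of boosting for regression can be sketched with squared loss: each stage fits a weak learner (a decision stump here) to the current residuals, which are the negative gradient of the loss, and adds it to the ensemble with a small learning rate. This is a generic gradient-boosting sketch under our own assumptions, not the paper's specific algorithm.

```python
import numpy as np

# Hedged sketch: gradient boosting for regression with squared loss.
def fit_stump(X, r):
    # Best single-feature threshold split minimizing squared error on r.
    best = None
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            left = X[:, j] <= t
            if not left.any() or left.all():
                continue
            lv, rv = r[left].mean(), r[~left].mean()
            err = ((r[left] - lv) ** 2).sum() + ((r[~left] - rv) ** 2).sum()
            if best is None or err < best[0]:
                best = (err, j, t, lv, rv)
    _, j, t, lv, rv = best
    return lambda Z, j=j, t=t, lv=lv, rv=rv: np.where(Z[:, j] <= t, lv, rv)

def boost(X, y, n_stages=50, lr=0.1):
    pred = np.full(len(y), y.mean())          # start from the mean
    for _ in range(n_stages):
        h = fit_stump(X, y - pred)            # fit stump to residuals
        pred += lr * h(X)                     # take a small gradient step
    return pred

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(100, 2))
y = np.sin(3 * X[:, 0]) + 0.1 * rng.normal(size=100)
pred = boost(X, y)
mse = ((y - pred) ** 2).mean()
```

Replacing the squared loss changes only the residual computation, which is what makes the gradient formulation a natural route from classification boosting to regression.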