Goto

Collaborating Authors

 Country


The Decision List Machine

Neural Information Processing Systems

We introduce a new learning algorithm for decision lists to allow features that are constructed from the data and to allow a tradeoff betweenaccuracy and complexity. We bound its generalization error in terms of the number of errors and the size of the classifier it finds on the training data. We also compare its performance on some natural data sets with the set covering machine and the support vector machine.


Automatic Derivation of Statistical Algorithms: The EM Family and Beyond

Neural Information Processing Systems

Machine learning has reached a point where many probabilistic methods canbe understood as variations, extensions and combinations of a much smaller set of abstract themes, e.g., as different instances of the EM algorithm. This enables the systematic derivation of algorithms customized fordifferent models. Here, we describe the AUTOBAYES system which takes a high-level statistical model specification, uses powerful symbolic techniques based on schema-based program synthesis and computer algebra to derive an efficient specialized algorithm for learning that model, and generates executable code implementing that algorithm. This capability is far beyond that of code collections such as Matlab toolboxes oreven tools for model-independent optimization such as BUGS for Gibbs sampling: complex new algorithms can be generated without newprogramming, algorithms can be highly specialized and tightly crafted for the exact structure of the model and data, and efficient and commented code can be generated for different languages or systems.



Stochastic Neighbor Embedding

Neural Information Processing Systems

We describe a probabilistic approach to the task of placing objects, described byhigh-dimensional vectors or by pairwise dissimilarities, in a low-dimensional space in a way that preserves neighbor identities. A Gaussian is centered on each object in the high-dimensional space and the densities under this Gaussian (or the given dissimilarities) are used to define a probability distribution over all the potential neighbors of the object. The aim of the embedding is to approximate this distribution aswell as possible when the same operation is performed on the low-dimensional "images" of the objects. A natural cost function is a sum of Kullback-Leibler divergences, one per object, which leads to a simple gradient for adjusting the positions of the low-dimensional images. Unlikeother dimensionality reduction methods, this probabilistic framework makes it easy to represent each object by a mixture of widely separated low-dimensional images. This allows ambiguous objects, like the document count vector for the word "bank", to have versions close to the images of both "river" and "finance" without forcing the images of outdoor concepts to be located close to those of corporate concepts.


Learning About Multiple Objects in Images: Factorial Learning without Factorial Search

Neural Information Processing Systems

We consider data which are images containing views of multiple objects. Our task is to learn about each of the objects present in the images. This task can be approached as a factorial learning problem, where each image must be explained by instantiating a model for each of the objects present with the correct instantiation parameters. A major problem with learning a factorial model is that as the number of objects increases, there is a combinatorial explosion of the number of configurations that need to be considered. We develop a method to extract object models sequentially from the data by making use of a robust statistical method, thus avoiding thecombinatorial explosion, and present results showing successful extraction of objects from real images.


Clustering with the Fisher Score

Neural Information Processing Systems

Recently the Fisher score (or the Fisher kernel) is increasingly used as a feature extractor for classification problems. The Fisher score is a vector of parameter derivatives of loglikelihood of a probabilistic model. This paper gives a theoretical analysis about how class information is preserved inthe space of the Fisher score, which turns out that the Fisher score consists of a few important dimensions with class information and many nuisance dimensions. When we perform clustering with the Fisher score, K-Means type methods are obviously inappropriate because they make use of all dimensions. So we will develop a novel but simple clustering algorithmspecialized for the Fisher score, which can exploit important dimensions.This algorithm is successfully tested in experiments with artificial data and real data (amino acid sequences).


Selectivity and Metaplasticity in a Unified Calcium-Dependent Model

Neural Information Processing Systems

A unified, biophysically motivated Calcium-Dependent Learning model has been shown to account for various rate-based and spike time-dependent paradigms for inducing synaptic plasticity. Here, we investigate the properties of this model for a multi-synapse neuron that receives inputs with different spike-train statistics. In addition, we present a physiological form of metaplasticity, an activity-driven regulation mechanism, that is essential for the robustness ofthe model.


Interpreting Neural Response Variability as Monte Carlo Sampling of the Posterior

Neural Information Processing Systems

The responses of cortical sensory neurons are notoriously variable, with the number of spikes evoked by identical stimuli varying significantly from trial to trial. This variability is most often interpreted as'noise', purely detrimental to the sensory system. In this paper, we propose an alternative viewin which the variability is related to the uncertainty, about world parameters, which is inherent in the sensory stimulus. Specifically, theresponses of a population of neurons are interpreted as stochastic samples from the posterior distribution in a latent variable model. In addition to giving theoretical arguments supporting such a representational scheme,we provide simulations suggesting how some aspects of response variability might be understood in this framework.


Linear Combinations of Optic Flow Vectors for Estimating Self-Motion - a Real-World Test of a Neural Model

Neural Information Processing Systems

The tangential neurons in the fly brain are sensitive to the typical optic flow patterns generated during self-motion. In this study, we examine whether a simplified linear model of these neurons can be used to estimate self-motionfrom the optic flow. We present a theory for the construction ofan estimator consisting of a linear combination of optic flow vectors that incorporates prior knowledge both about the distance distribution ofthe environment, and about the noise and self-motion statistics of the sensor. The estimator is tested on a gantry carrying an omnidirectional visionsensor. The experiments show that the proposed approach leads to accurate and robust estimates of rotation rates, whereas translation estimatesturn out to be less reliable.


Hyperkernels

Neural Information Processing Systems

We consider the problem of choosing a kernel suitable for estimation using a Gaussian Process estimator or a Support Vector Machine. A novel solution is presented which involves defining a Reproducing Kernel HilbertSpace on the space of kernels itself. By utilizing an analog of the classical representer theorem, the problem of choosing a kernel from a parameterized family of kernels (e.g. of varying width) is reduced to a statistical estimation problem akin to the problem of minimizing a regularized risk functional. Various classical settings for model or kernel selection are special cases of our framework.