A New Model of Spatial Representation in Multimodal Brain Areas

Neural Information Processing Systems

Most models of spatial representations in the cortex assume cells with limited receptive fields that are defined in a particular egocentric frame of reference. However, cells outside of primary sensory cortex are either gain modulated by postural input or partially shifting. We show that solving classical spatial tasks, like sensory prediction, multi-sensory integration, sensory-motor transformation and motor control, requires more complicated intermediate representations that are not invariant in one frame of reference. We present an iterative basis function map that performs these spatial tasks optimally with gain modulated and partially shifting units, and test it against neurophysiological and neuropsychological data. In order to perform an action directed toward an object, it is necessary to have a representation of its spatial location.


Temporally Dependent Plasticity: An Information Theoretic Account

Neural Information Processing Systems

It should be stressed that in our model information is coded in the non-stationary rates that underlie the input spike trains. As these rates are not observable, any learning must depend on the observable input spikes that realize those underlying rates.


Support Vector Novelty Detection Applied to Jet Engine Vibration Spectra

Neural Information Processing Systems

A system has been developed to extract diagnostic information from jet engine carcass vibration data. Support Vector Machines applied to novelty detection provide a measure of how unusual the shape of a vibration signature is, by learning a representation of normality. We describe a novel method for including information from a second class in Support Vector Machine novelty detection and give results from the application to jet engine vibration analysis.


A Neural Probabilistic Language Model

Neural Information Processing Systems

A goal of statistical language modeling is to learn the joint probability function of sequences of words. This is intrinsically difficult because of the curse of dimensionality: we propose to fight it with its own weapons. In the proposed approach one learns simultaneously (1) a distributed representation for each word (i.e. a similarity between words) along with (2) the probability function for word sequences, expressed with these representations. Generalization is obtained because a sequence of words that has never been seen before gets high probability if it is made of words that are similar to words forming an already seen sentence. We report on experiments using neural networks for the probability function, showing on two text corpora that the proposed approach very significantly improves on a state-of-the-art trigram model.

1 Introduction

A fundamental problem that makes language modeling and other learning problems difficult is the curse of dimensionality. It is particularly obvious in the case when one wants to model the joint distribution between many discrete random variables (such as words in a sentence, or discrete attributes in a data-mining task).
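The idea of expressing word probabilities through learned word feature vectors can be illustrated with a small forward pass. The following is only a sketch under stated assumptions: the layer sizes, the random initialization, the absence of bias terms and the names `make_model` and `next_word_probs` are all hypothetical choices for illustration, and no training is shown.

```python
import math
import random

def make_model(vocab_size, embed_dim, context, hidden, seed=0):
    """Build randomly initialized parameters (hypothetical sizes, untrained)."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    return {
        "C": mat(vocab_size, embed_dim),         # one feature vector per word
        "H": mat(hidden, context * embed_dim),   # hidden layer weights
        "U": mat(vocab_size, hidden),            # output layer weights
    }

def next_word_probs(model, context_ids):
    """Return a probability for every vocabulary word given the context words."""
    # Concatenate the feature vectors of the context words.
    x = [v for i in context_ids for v in model["C"][i]]
    # Hidden layer with tanh nonlinearity.
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in model["H"]]
    # One score per vocabulary word, normalized with a stable softmax.
    scores = [sum(w * hi for w, hi in zip(row, h)) for row in model["U"]]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

model = make_model(vocab_size=10, embed_dim=4, context=2, hidden=8)
probs = next_word_probs(model, [3, 7])
```

Because words with nearby feature vectors produce nearby hidden activations, two contexts made of similar words receive similar output distributions, which is the source of generalization described in the abstract.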


FaceSync: A Linear Operator for Measuring Synchronization of Video Facial Images and Audio Tracks

Neural Information Processing Systems

FaceSync is an optimal linear algorithm that finds the degree of synchronization between the audio and image recordings of a human speaker. Using canonical correlation, it finds the best direction to combine all the audio and image data, projecting them onto a single axis. FaceSync uses Pearson's correlation to measure the degree of synchronization between the audio and image data. We derive the optimal linear transform to combine the audio and visual information and describe an implementation that avoids the numerical problems caused by computing the correlation matrices.
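The measurement step described above (project each modality onto a single axis, then take Pearson's correlation) can be sketched as follows. This is a minimal illustration, not the FaceSync implementation: the projection weights `w_audio` and `w_image` are hypothetical fixed values rather than directions found by canonical correlation, and the feature values are made-up toy data.

```python
import math

def project(frames, weights):
    """Project each feature vector onto a single axis (dot product)."""
    return [sum(w * f for w, f in zip(weights, frame)) for frame in frames]

def pearson(xs, ys):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

# Toy per-frame features (hypothetical): 2 audio and 3 image dimensions.
audio = [[0.1, 0.2], [0.4, 0.1], [0.9, 0.3], [0.5, 0.0]]
image = [[1.0, 0.0, 0.2], [2.0, 0.1, 0.1], [4.1, 0.0, 0.3], [2.4, 0.2, 0.0]]

# Assumed projection directions; a real system would learn these.
w_audio = [1.0, 0.0]
w_image = [1.0, 0.0, 0.0]

sync = pearson(project(audio, w_audio), project(image, w_image))
```

A value of `sync` near 1 indicates that the projected audio and image signals rise and fall together; repeating the computation with the audio shifted by a few frames would give one way to compare candidate alignments.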


Who Does What? A Novel Algorithm to Determine Function Localization

Neural Information Processing Systems

We introduce a novel algorithm, termed PPA (Performance Prediction Algorithm), that quantitatively measures the contributions of elements of a neural system to the tasks it performs. The algorithm identifies the neurons or areas which participate in a cognitive or behavioral task, given data about performance decrease in a small set of lesions. It also allows the accurate prediction of performances due to multi-element lesions. The effectiveness of the new algorithm is demonstrated in two models of recurrent neural networks with complex interactions among the elements. The algorithm is scalable and applicable to the analysis of large neural networks. Given the recent advances in reversible inactivation techniques, it has the potential to significantly contribute to the understanding of the organization of biological nervous systems, and to shed light on the long-lasting debate about local versus distributed computation in the brain.


Beyond Maximum Likelihood and Density Estimation: A Sample-Based Criterion for Unsupervised Learning of Complex Models

Neural Information Processing Systems

Two well known classes of unsupervised procedures that can be cast in this manner are generative and recoding models. In a generative unsupervised framework, the environment generates training examples, which we will refer to as observations, by sampling from one distribution; the other distribution is embodied in the model. Examples of generative frameworks are mixtures of Gaussians (MoG) [2], factor analysis [4], and Boltzmann machines [8]. In the recoding unsupervised framework, the model transforms points from an observation space to an output space, and the output distribution is compared either to a reference distribution or to a distribution derived from the output distribution.


Universality and Individuality in a Neural Code

Neural Information Processing Systems

This basic question in the theory of knowledge seems to be beyond the scope of experimental investigation. An accessible version of this question is whether different observers of the same sense data have the same neural representation of these data: how much of the neural code is universal, and how much is individual? Differences in the neural codes of different individuals may arise from various sources: First, different individuals may use different 'vocabularies' of coding symbols. Second, they may use the same symbols to encode different stimulus features.


The Unscented Particle Filter

Neural Information Processing Systems

In this paper, we propose a new particle filter based on sequential importance sampling. The algorithm uses a bank of unscented filters to obtain the importance proposal distribution. This proposal has two very "nice" properties. Firstly, it makes efficient use of the latest available information and, secondly, it can have heavy tails. As a result, we find that the algorithm outperforms standard particle filtering and other nonlinear filtering methods very substantially.
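The role of the proposal distribution in sequential importance sampling can be seen in a single update step. The sketch below is an assumption-laden illustration, not the unscented particle filter: it uses a hypothetical one-dimensional random-walk state model with Gaussian observation noise, and it replaces the bank of unscented filters with a simple hand-picked Gaussian proposal centered between the previous particle and the new observation. The function names and noise parameters are illustrative only.

```python
import math
import random

def normal_pdf(x, mean, std):
    """Density of a univariate Gaussian."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def sis_step(particles, weights, y, rng,
             trans_std=1.0, obs_std=0.5, prop_std=1.5):
    """One sequential importance sampling step for an assumed toy model.

    State model (assumed):  x_t = x_{t-1} + N(0, trans_std^2)
    Observation (assumed):  y_t = x_t + N(0, obs_std^2)
    """
    new_particles, new_weights = [], []
    for x_prev, w_prev in zip(particles, weights):
        # The proposal looks at the latest observation y, not only at x_prev.
        prop_mean = 0.5 * (x_prev + y)
        x = rng.gauss(prop_mean, prop_std)
        likelihood = normal_pdf(y, x, obs_std)
        prior = normal_pdf(x, x_prev, trans_std)
        proposal = normal_pdf(x, prop_mean, prop_std)
        new_particles.append(x)
        # Importance weight: likelihood * transition prior / proposal density.
        new_weights.append(w_prev * likelihood * prior / proposal)
    total = sum(new_weights)
    return new_particles, [w / total for w in new_weights]

rng = random.Random(0)
particles = [rng.gauss(0.0, 1.0) for _ in range(200)]
weights = [1.0 / 200] * 200
particles, weights = sis_step(particles, weights, y=0.8, rng=rng)
estimate = sum(w * x for w, x in zip(weights, particles))
```

The proposal standard deviation is deliberately chosen wider than the transition noise so that the proposal covers the region where the weights are non-negligible; a practical filter would also add a resampling step, which is omitted here.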


Accumulator Networks: Suitors of Local Probability Propagation

Neural Information Processing Systems

The sum-product algorithm can be directly applied in Gaussian networks and in graphs for coding, but for many conditional probability functions - including the sigmoid function - direct application of the sum-product algorithm is not possible. We introduce "accumulator networks" that have low local complexity (but exponential global complexity) so the sum-product algorithm can be directly applied. In an accumulator network, the probability of a child given its parents is computed by accumulating the inputs from the parents in a Markov chain or more generally a tree. After giving expressions for inference and learning in accumulator networks, we give results on the "bars problem" and on the problem of extracting translated, overlapping faces from an image.

1 Introduction

Graphical probability models with hidden variables are capable of representing complex dependencies between variables, filling in missing data and making Bayes-optimal decisions using probabilistic inferences (Hinton and Sejnowski 1986; Pearl 1988; Neal 1992). Large, richly-connected networks with many cycles can potentially be used to model complex sources of data, such as audio signals, images and video. However, when the number of cycles in the network is large (more precisely, when the cut set size is exponential), exact inference becomes intractable. Also, to learn a probability model with hidden variables, we need to fill in the missing data using probabilistic inference, so learning also becomes intractable. To cope with the intractability of exact inference, a variety of approximate inference methods have been invented, including Monte Carlo (Hinton and Sejnowski 1986; Neal 1992), Helmholtz machines (Dayan et al. 1995; Hinton et al. 1995), and variational techniques (Jordan et al. 1998).
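The abstract's idea of accumulating parent inputs along a Markov chain, so that each local step stays cheap, can be illustrated with a toy forward pass. Everything specific here is an assumption made for illustration and is not taken from the paper: the parents are independent binary variables, each active parent adds a hypothetical integer increment to a running sum, and the child turns on when the final sum reaches a hard threshold.

```python
def accumulate_chain(parent_probs, increments):
    """Forward messages along the accumulator chain.

    Returns a dict mapping each possible running sum to its probability
    after all parents have been absorbed, one parent per chain step.
    """
    message = {0: 1.0}  # the accumulator starts at zero with certainty
    for p_on, inc in zip(parent_probs, increments):
        nxt = {}
        for s, p in message.items():
            # Parent off: the running sum is unchanged.
            nxt[s] = nxt.get(s, 0.0) + p * (1.0 - p_on)
            # Parent on: the running sum grows by this parent's increment.
            nxt[s + inc] = nxt.get(s + inc, 0.0) + p * p_on
        message = nxt
    return message

def child_on_prob(parent_probs, increments, threshold):
    """P(child = 1) under the assumed hard-threshold rule on the final sum."""
    final = accumulate_chain(parent_probs, increments)
    return sum(p for s, p in final.items() if s >= threshold)

final_sum = accumulate_chain([0.5, 0.5, 0.5], [1, 1, 1])
p_child = child_on_prob([0.5, 0.5, 0.5], [1, 1, 1], threshold=2)
```

Each chain step touches only the current distribution over running sums, so the cost grows with the number of distinct sums rather than with the number of joint parent configurations, which is the kind of low local complexity the abstract refers to.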