Goto

Collaborating Authors

 Country


VLSI Model of Primate Visual Smooth Pursuit

Neural Information Processing Systems

A one dimensional model of primate smooth pursuit mechanism has been implemented in 2 11m CMOS VLSI. The model consolidates Robinson's negative feedback model with Wyatt and Pola's positive feedback scheme, to produce a smooth pursuit system which zero's the velocity of a target on the retina. Furthermore, the system uses the current eye motion as a predictor for future target motion. Analysis, stability and biological correspondence of the system are discussed. For implementation at the focal plane, a local correlation based visual motion detection technique is used. Velocity measurements, ranging over 4 orders of magnitude with 15% variation, provides the input to the smooth pursuit system. The system performed successful velocity tracking for high contrast scenes. Circuit design and performance of the complete smooth pursuit system is presented.


Neuron-MOS Temporal Winner Search Hardware for Fully-Parallel Data Processing

Neural Information Processing Systems

Search for the largest (or the smallest) among a number of input data, Le., the winner-take-all (WTA) action, is an essential part of intelligent data processing such as data retrieval in associative memories [3], vector quantization circuits [4], Kohonen's self-organizing maps [5] etc. In addition to the maximum or minimum search, data sorting also plays an essential role in a number of signal processing such as median filtering in image processing, evolutionary algorithms in optimizing problems [6] and so forth.


Does the Wake-sleep Algorithm Produce Good Density Estimators?

Neural Information Processing Systems

The wake-sleep algorithm (Hinton, Dayan, Frey and Neal 1995) is a relatively efficientmethod of fitting a multilayer stochastic generative model to high-dimensional data. In addition to the top-down connections inthe generative model, it makes use of bottom-up connections for approximating the probability distribution over the hidden units given the data, and it trains these bottom-up connections using a simple delta rule. We use a variety of synthetic and real data sets to compare the performance ofthe wake-sleep algorithm with Monte Carlo and mean field methods for fitting the same generative model and also compare it with other models that are less powerful but easier to fit. 1 INTRODUCTION Neural networks are often used as bottom-up recognition devices that transform input vectors intorepresentations of those vectors in one or more hidden layers. But multilayer networks ofstochastic neurons can also be used as top-down generative models that produce patterns with complicated correlational structure in the bottom visible layer. In this paper we consider generative models composed of layers of stochastic binary logistic units. Given a generative model parameterized by top-down weights, there is an obvious way to perform unsupervised learning. The generative weights are adjusted to maximize the probability thatthe visible vectors generated by the model would match the observed data.


Using Unlabeled Data for Supervised Learning

Neural Information Processing Systems

Geoffrey Towell Siemens Corporate Research 755 College Road East Princeton, NJ 08540 Abstract Many classification problems have the property that the only costly part of obtaining examples is the class label. This paper suggests a simple method for using distribution information contained in unlabeled examples to augment labeled examples in a supervised training framework. Empirical tests show that the technique described inthis paper can significantly improve the accuracy of a supervised learner when the learner is well below its asymptotic accuracy level. 1 INTRODUCTION Supervised learning problems often have the following property: unlabeled examples have little or no cost while class labels have a high cost. For example, it is trivial to record hours of heartbeats from hundreds of patients. However, it is expensive to hire cardiologists to label each of the recorded beats.


Is Learning The n-th Thing Any Easier Than Learning The First?

Neural Information Processing Systems

This paper investigates learning in a lifelong context. Lifelong learning addresses situations in which a learner faces a whole stream of learning tasks.Such scenarios provide the opportunity to transfer knowledge across multiple learning tasks, in order to generalize more accurately from less training data. In this paper, several different approaches to lifelong learning are described, and applied in an object recognition domain. It is shown that across the board, lifelong learning approaches generalize consistently more accurately from less training data, by their ability to transfer knowledge across learning tasks. 1 Introduction Supervised learning is concerned with approximating an unknown function based on examples. Virtuallyall current approaches to supervised learning assume that one is given a set of input-output examples, denoted by X, which characterize an unknown function, denoted by f.


A Multiscale Attentional Framework for Relaxation Neural Networks

Neural Information Processing Systems

Many practical problems in computer vision, pattern recognition, robotics and other areas can be described in terms of constrained optimization. In the past decade, researchers have proposed means of solving such problems with the use of neural networks [Hopfield & Tank, 1985; Koch et ai., 1986], which are thus derived as relaxation dynamics for the objective functions codifying the optimization task. One disturbing aspect of the approach soon became obvious, namely the apparent inabilityof the methods to scale up to practical problems, the principal reason being the rapid increase in the number of local minima present in the objectives as the dimension of the problem increases. Moreover most objectives, E(v), are highly nonlinear, non-convex functions of v, and simple techniques (e.g.


Softassign versus Softmax: Benchmarks in Combinatorial Optimization

Neural Information Processing Systems

Steven Gold Department of Computer Science Yale University New Haven, CT 06520-8285 AnandRangarajan Dept. of Diagnostic Radiology Yale University New Haven, CT 06520-8042 Abstract A new technique, termed soft assign, is applied for the first time to two classic combinatorial optimization problems, the traveling salesmanproblem and graph partitioning. Softassign, which has emerged from the recurrent neural network/statistical physics framework, enforces two-way (assignment) constraints without the use of penalty terms in the energy functions. The softassign can also be generalized from two-way winner-take-all constraints to multiple membership constraints which are required for graph partitioning. Thesoftassign technique is compared to the softmax (Potts glass). Within the statistical physics framework, softmax and a penalty term has been a widely used method for enforcing the two-way constraints common within many combinatorial optimization problems.The benchmarks present evidence that softassign has clear advantages in accuracy, speed, parallelizabilityand algorithmic simplicityover softmax and a penalty term in optimization problems with two-way constraints. 1 Introduction In a series of papers in the early to mid 1980's, Hopfield and Tank introduced techniques which allowed one to solve combinatorial optimization problems with recurrent neural networks [Hopfield and Tank, 1985].


SPERT-II: A Vector Microprocessor System and its Application to Large Problems in Backpropagation Training

Neural Information Processing Systems

We report on our development of a high-performance system for neural network and other signal processing applications. We have designed and implemented a vector microprocessor and packaged itas an attached processor for a conventional workstation.


Finite State Automata that Recurrent Cascade-Correlation Cannot Represent

Neural Information Processing Systems

This paper relates the computational power of Fahlman' s Recurrent Cascade Correlation (RCC) architecture to that of fInite state automata (FSA). While some recurrent networks are FSA equivalent, RCC is not. The paper presents a theoretical analysis of the RCC architecture in the form of a proof describing a large class of FSA which cannot be realized by RCC. 1 INTRODUCTION Recurrent networks can be considered to be defmed by two components: a network architecture, and a learning rule. The former describes how a network with a given set of weights and topology computes its output values, while the latter describes how the weights (and possibly topology) of the network are updated to fIt a specifIc problem. It is possible to evaluate the computational power of a network architecture by analyzing the types of computations a network could perform assuming appropriate connection weights (and topology).


From Isolation to Cooperation: An Alternative View of a System of Experts

Neural Information Processing Systems

We introduce a constructive, incremental learning system for regression problems that models data by means of locally linear experts. In contrast to other approaches, the experts are trained independently and do not compete for data during learning. Only when a prediction for a query is required do the experts cooperate by blending their individual predictions. Eachexpert is trained by minimizing a penalized local cross validation errorusing second order methods. In this way, an expert is able to find a local distance metric by adjusting the size and shape of the receptive fieldin which its predictions are valid, and also to detect relevant input features by adjusting its bias on the importance of individual input dimensions. We derive asymptotic results for our method. In a variety of simulations the properties of the algorithm are demonstrated with respect to interference, learning speed, prediction accuracy, feature detection, and task oriented incremental learning.