Robust Reinforcement Learning in Motion Planning

Neural Information Processing Systems

While exploring to find better solutions, an agent performing online reinforcement learning (RL) can perform worse than is acceptable. In some cases, exploration might have unsafe, or even catastrophic, results, often modeled in terms of reaching 'failure' states of the agent's environment. This paper presents a method that uses domain knowledge to reduce the number of failures during exploration. This method formulates the set of actions from which the RL agent composes a control policy to ensure that exploration is conducted in a policy space that excludes most of the unacceptable policies. The resulting action set has a more abstract relationship to the task being solved than is common in many applications of RL. Although the cost of this added safety is that learning may result in a suboptimal solution, we argue that this is an appropriate tradeoff in many problems. We illustrate this method in the domain of motion planning. This work was done while the first author was finishing his Ph.D. in computer science at the University of Massachusetts, Amherst.




Neurobiology, Psychophysics, and Computational Models of Visual Attention

Neural Information Processing Systems

The purpose of this workshop was to discuss both recent experimental findings and computational models of the neurobiological implementation of selective attention. Recent experimental results were presented in two of the four presentations given (C.E. Connor, Washington University and B.C. Motter, SUNY and V.A. Medical Center, Syracuse), while the other two talks were devoted to computational models (E. Connor presented the results of an experiment in which the receptive field profiles of V4 neurons were mapped during different states of attention in an awake, behaving monkey. The attentional focus was manipulated in this experiment by altering the position of a behaviorally relevant ring-shaped stimulus.


An Analog VLSI Saccadic Eye Movement System

Neural Information Processing Systems

In an effort to understand saccadic eye movements and their relation to visual attention and other forms of eye movements, we - in collaboration with a number of other laboratories - are carrying out a large-scale effort to design and build a complete primate oculomotor system using analog CMOS VLSI technology. Using this technology, a low power, compact, multi-chip system has been built which works in real-time using real-world visual inputs. We describe in this paper the performance of an early version of such a system including a 1-D array of photoreceptors mimicking the retina, a circuit computing the mean location of activity representing the superior colliculus, a saccadic burst generator, and a one degree-of-freedom rotational platform which models the dynamic properties of the primate oculomotor plant.

1 Introduction

When we look around our environment, we move our eyes to center and stabilize objects of interest onto our fovea. In order to achieve this, our eyes move in quick jumps with short pauses in between. These quick jumps (up to 750 deg/sec in humans) are known as saccades and are seen in both exploratory eye movements and as reflexive eye movements in response to sudden visual, auditory, or somatosensory stimuli. Since the intent of the saccade is to bring new objects of interest onto the fovea, it can be considered a primitive attentional mechanism.


Hidden Markov Models for Human Genes

Neural Information Processing Systems

Human genes are not continuous but rather consist of short coding regions (exons) interspersed with highly variable non-coding regions (introns). We apply HMMs to the problem of modeling exons, introns and detecting splice sites in the human genome. Our most interesting result so far is the detection of particular oscillatory patterns, with a minimal period of roughly 10 nucleotides, that seem to be characteristic of exon regions and may have significant biological implications.
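As a rough illustration of how an HMM can label a DNA sequence with exon and intron states, the following is a minimal Viterbi decoding sketch. It is not the model from the paper: the two-state layout, the names `START`, `TRANS`, `EMIT` and `viterbi`, and every probability value are hypothetical, chosen only so that GC-rich stretches favor the "exon" state and AT-rich stretches favor the "intron" state. It assumes a non-empty sequence over the letters A, C, G, T.

```python
import math

# Hypothetical two-state HMM; all numbers below are illustrative assumptions.
STATES = ("exon", "intron")
START = {"exon": 0.5, "intron": 0.5}
TRANS = {
    "exon": {"exon": 0.9, "intron": 0.1},
    "intron": {"exon": 0.1, "intron": 0.9},
}
EMIT = {
    "exon": {"A": 0.2, "C": 0.3, "G": 0.3, "T": 0.2},
    "intron": {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3},
}


def viterbi(sequence):
    """Return the most likely state path for a non-empty nucleotide string."""
    # Log-probability of the best path ending in each state at position 0.
    scores = {
        s: math.log(START[s]) + math.log(EMIT[s][sequence[0]]) for s in STATES
    }
    backpointers = []
    for symbol in sequence[1:]:
        new_scores, pointers = {}, {}
        for s in STATES:
            # Best predecessor state for landing in state s at this position.
            best_prev = max(
                STATES, key=lambda p: scores[p] + math.log(TRANS[p][s])
            )
            new_scores[s] = (
                scores[best_prev]
                + math.log(TRANS[best_prev][s])
                + math.log(EMIT[s][symbol])
            )
            pointers[s] = best_prev
        backpointers.append(pointers)
        scores = new_scores
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: scores[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path
```

With these assumed parameters, a call such as `viterbi("GCGCGCGCATATATAT")` labels the GC-rich half as exon and the AT-rich half as intron. A realistic gene model would need many more states, for example for splice sites and codon position.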


Segmental Neural Net Optimization for Continuous Speech Recognition

Neural Information Processing Systems

Previously, we had developed the concept of a Segmental Neural Net (SNN) for phonetic modeling in continuous speech recognition (CSR). This kind of neural network technology advanced the state-of-the-art of large-vocabulary CSR, which employs Hidden Markov Models (HMM), for the ARPA 1000-word Resource Management corpus. More recently, we started porting the neural net system to a larger, more challenging corpus - the ARPA 20,000-word Wall Street Journal (WSJ) corpus. During the porting, we explored the following research directions to refine the system: i) training context-dependent models with a regularization method; ii) training SNN with projection pursuit; and iii) combining different models into a hybrid system. When tested on both a development set and an independent test set, the resulting neural net system alone yielded a performance at the level of the HMM system, and the hybrid SNN/HMM system achieved a consistent 10-15% word error reduction over the HMM system. This paper describes our hybrid system, with emphasis on the optimization methods employed.


VLSI Phase Locking Architectures for Feature Linking in Multiple Target Tracking Systems

Neural Information Processing Systems

Recent physiological research has shown that synchronization of oscillatory responses in striate cortex may code for relationships between visual features of objects. A VLSI circuit has been designed to provide rapid phase-locking synchronization of multiple oscillators to allow for further exploration of this neural mechanism. By exploiting the intrinsic random transistor mismatch of devices operated in subthreshold, large groups of phase-locked oscillators can be readily partitioned into smaller phase-locked groups. A multiple target tracker for binary images is described utilizing this phase-locking architecture. A VLSI chip has been fabricated and tested to verify the architecture.


Coupled Dynamics of Fast Neurons and Slow Interactions

Neural Information Processing Systems

A simple model of coupled dynamics of fast neurons and slow interactions, modelling self-organization in recurrent neural networks, leads naturally to an effective statistical mechanics characterized by a partition function which is an average over a replicated system. This is reminiscent of the replica trick used to study spin-glasses, but with the difference that the number of replicas has a physical meaning as the ratio of two temperatures and can be varied throughout the whole range of real values. The model has interesting phase consequences as a function of varying this ratio and external stimuli, and can be extended to a range of other models. As the basic archetypal model we consider a system of Ising spin neurons $\sigma_i \in \{-1, 1\}$, $i \in \{1, \ldots, N\}$, interacting via continuous-valued symmetric interactions, $J_{ij}$, which themselves evolve in response to the states of the neurons. $\sum_{ij} J_{ij}\sigma_i\sigma_j$ (2) and the subscript $\{J_{ij}\}$ indicates that the $\{J_{ij}\}$ are to be considered as quenched variables.


Classifying Hand Gestures with a View-Based Distributed Representation

Neural Information Processing Systems

We present a method for learning, tracking, and recognizing human hand gestures recorded by a conventional CCD camera without any special gloves or other sensors. A view-based representation is used to model aspects of the hand relevant to the trained gestures, and is found using an unsupervised clustering technique. We use normalized correlation networks, with dynamic time warping in the temporal domain, as a distance function for unsupervised clustering. Views are computed separably for space and time dimensions; the distributed response of the combination of these units characterizes the input data with a low dimensional representation. A supervised classification stage uses labeled outputs of the spatiotemporal units as training data. Our system can correctly classify gestures in real time with a low-cost image processing accelerator.
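The idea of using normalized correlation together with dynamic time warping as a distance between gesture sequences can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each frame is a plain list of numbers, takes one minus the normalized correlation as the per-frame cost, and uses the textbook dynamic time warping recurrence. The function names are hypothetical.

```python
import math


def normalized_correlation(a, b):
    """Normalized (mean-subtracted) correlation of two equal-length vectors."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    num = sum(x * y for x, y in zip(da, db))
    den = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    # Treat a constant frame as uncorrelated with everything (assumption).
    return num / den if den > 0 else 0.0


def frame_distance(a, b):
    """Per-frame cost: 0 for perfectly correlated frames, 2 for anti-correlated."""
    return 1.0 - normalized_correlation(a, b)


def dtw_distance(seq_a, seq_b):
    """Dynamic time warping cost between two non-empty sequences of frames."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j] is the best alignment cost of the first i and first j frames.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # stretch seq_b's frame
                cost[i][j - 1],      # stretch seq_a's frame
                cost[i - 1][j - 1],  # advance both
            )
    return cost[n][m]
```

Because the warping path may repeat frames, a gesture and a slowed-down copy of it get a distance close to zero, which is the property that makes such a measure usable for clustering sequences of different durations.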