Goto

Collaborating Authors

 Country


Robust Reinforcement Learning in Motion Planning

Neural Information Processing Systems

While exploring to find better solutions, an agent performing online reinforcement learning (RL) can perform worse than is acceptable. In some cases, exploration might have unsafe, or even catastrophic, results, often modeled in terms of reaching'failure' states of the agent's environment. This paper presents a method that uses domain knowledge to reduce the number of failures during exploration. This method formulates the set of actions from which the RL agent composes a control policy to ensure that exploration is conducted in a policy space that excludes most of the unacceptable policies. The resulting action set has a more abstract relationship to the task being solved than is common in many applications of RL. Although the cost of this added safety is that learning may result in a suboptimal solution, we argue that this is an appropriate tradeoff in many problems. We illustrate this method in the domain of motion planning. "'This work was done while the first author was finishing his Ph.D in computer science at the University of Massachusetts, Amherst.


Exploiting Chaos to Control the Future

Neural Information Processing Systems

Recently, Ott, Grebogi and Yorke (OGY) [6] found an effective method to control chaotic systems to unstable fixed points by using only small control forces; however, OGY's method is based on and limited to a linear theory and requires considerable knowledge of the dynamics of the system to be controlled. In this paper we use two radial basis function networks: one as a model of an unknown plant and the other as the controller. The controller is trained with a recurrent learning algorithm to minimize a novel objective function such that the controller can locate an unstable fixed point and drive the system into the fixed point with no a priori knowledge of the system dynamics. Our results indicate that the neural controller offers many advantages over OGY's technique.


Synchronization, oscillations, and 1/f noise in networks of spiking neurons

Neural Information Processing Systems

The model consists of a two-dimensional sheet of leaky integrateand-fire neurons with feedback connectivity consisting of local excitation and surround inhibition. Each neuron is independently driven by homogeneous external noise. Spontaneous symmetry breaking occurs, resulting in the formation of "hotspots" of activity in the network. These localized patterns of excitation appear as clusters that coalesce, disintegrate, or fluctuate in size while simultaneously moving in a random walk constrained by the interaction with other clusters. The emergent cross-correlation functions have a dual structure, with a sharp peak around zero on top of a much broader hill.


An Analog VLSI Model of Central Pattern Generation in the Leech

Neural Information Processing Systems

The biological network is small and relatively well understood, and the silicon model can therefore span three levels of organization in the leech nervous system (neuron, ganglion, system); it represents one of the first comprehensive models of leech swimming operating in real-time. The circuit employs biophysically motivated analog neurons networked to form multiple biologically inspired silicon ganglia. These ganglia are coupled using known interganglionic connections. Thus the model retains the flavor of its biological counterpart, and though simplified, the output of the silicon circuit is similar to the output of the leech swim central pattern generator. The model operates on the same time-and spatial-scale as the leech nervous system and will provide an excellent platform with which to explore real-time adaptive locomotion in the leech and other "simple" invertebrate nervous systems.


Optimal Unsupervised Motor Learning Predicts the Internal Representation of Barn Owl Head Movements

Neural Information Processing Systems

This implies the existence of a set of orthogonal internal coordinates that are related to meaningful coordinates of the external world. No coherent computational theory has yet been proposed to explain this finding. I have proposed a simple model which provides a framework for a theory of low-level motor learning. I show that the theory predicts the observed microstimulation results in the barn owl. The model rests on the concept of "Optimal U n supervised Motor Learning", which provides a set of criteria that predict optimal internal representations. I describe two iterative Neural Network algorithms which find the optimal solution and demonstrate possible mechanisms for the development of internal representations in animals. 1 INTRODUCTION In the sensory domain, many algorithms for unsupervised learning have been proposed. These algorithms learn depending on statistical properties of the input data, and often can be used to find useful "intermediate" sensory representations


Foraging in an Uncertain Environment Using Predictive Hebbian Learning

Neural Information Processing Systems

Survival is enhanced by an ability to predict the availability of food, the likelihood of predators, and the presence of mates. We present a concrete model that uses diffuse neurotransmitter systems to implement a predictive version of a Hebb learning rule embedded in a neural architecture based on anatomical and physiological studies on bees. The model captured the strategies seen in the behavior of bees and a number of other animals when foraging in an uncertain environment. The predictive model suggests a unified way in which neuromodulatory influences can be used to bias actions and control synaptic plasticity. Successful predictions enhance adaptive behavior by allowing organisms to prepare for future actions, rewards, or punishments. Moreover, it is possible to improve upon behavioral choices if the consequences of executing different actions can be reliably predicted. Although classical and instrumental conditioning results from the psychological literature [1] demonstrate that the vertebrate brain is capable of reliable prediction, how these predictions are computed in brains is not yet known. The brains of vertebrates and invertebrates possess small nuclei which project axons throughout large expanses of target tissue and deliver various neurotransmitters such as dopamine, norepinephrine, and acetylcholine [4]. The activity in these systems may report on reinforcing stimuli in the world or may reflect an expectation of future reward [5, 6,7,8].


A Hodgkin-Huxley Type Neuron Model That Learns Slow Non-Spike Oscillation

Neural Information Processing Systems

A gradient descent algorithm for parameter estimation which is similar to those used for continuous-time recurrent neural networks was derived for Hodgkin-Huxley type neuron models. Using membrane potential trajectories as targets, the parameters (maximal conductances, thresholds and slopes of activation curves, time constants) were successfully estimated. The algorithm was applied to modeling slow non-spike oscillation of an identified neuron in the lobster stomatogastric ganglion. A model with three ionic currents was trained with experimental data. It revealed a novel role of A-current for slow oscillation below -50 mY. 1 INTRODUCTION Conductance-based neuron models, first formulated by Hodgkin and Huxley [10], are commonly used for describing biophysical mechanisms underlying neuronal behavior.


An Analog VLSI Saccadic Eye Movement System

Neural Information Processing Systems

In an effort to understand saccadic eye movements and their relation to visual attention and other forms of eye movements, we - in collaboration with a number of other laboratories - are carrying out a large-scale effort to design and build a complete primate oculomotor system using analog CMOS VLSI technology. Using this technology, a low power, compact, multi-chip system has been built which works in real-time using real-world visual inputs. We describe in this paper the performance of an early version of such a system including a 1-D array of photoreceptors mimicking the retina, a circuit computing the mean location of activity representing the superior colliculus, a saccadic burst generator, and a one degree-of-freedom rotational platform which models the dynamic properties of the primate oculomotor plant. 1 Introduction When we look around our environment, we move our eyes to center and stabilize objects of interest onto our fovea. In order to achieve this, our eyes move in quick jumps with short pauses in between. These quick jumps (up to 750 deg/sec in humans) are known as saccades and are seen in both exploratory eye movements and as reflexive eye movements in response to sudden visual, auditory, or somatosensory stimuli. Since the intent of the saccade is to bring new objects of interest onto the fovea, it can be considered a primitive attentional mechanism.


Bayesian Modeling and Classification of Neural Signals

Neural Information Processing Systems

Signal processing and classification algorithms often have limited applicability resulting from an inaccurate model of the signal's underlying structure. We present here an efficient, Bayesian algorithm for modeling a signal composed of the superposition of brief, Poisson-distributed functions. This methodology is applied to the specific problem of modeling and classifying extracellular neural waveforms which are composed of a superposition of an unknown number of action potentials CAPs). Previous approaches have had limited success due largely to the problems of determining the spike shapes, deciding how many are shapes distinct, and decomposing overlapping APs. A Bayesian solution to each of these problems is obtained by inferring a probabilistic model of the waveform. This approach quantifies the uncertainty of the form and number of the inferred AP shapes and is used to obtain an efficient method for decomposing complex overlaps. This algorithm can extract many times more information than previous methods and facilitates the extracellular investigation of neuronal classes and of interactions within neuronal circuits.


Directional Hearing by the Mauthner System

Neural Information Processing Systems

The most prominent feature of this network is the pair of large Mauthner cells whose axons cross the midline and descend down the spinal cord to synapse on primary motoneurons. The Mauthner system also includes inhibitory neurons, the PHP cells, which have a unique and intense field effect inhibition at the spikeinitiating zone of the Mauthner cells (Faber and Korn, 1978). The Mauthner system is part of the full brainstem escape network which also includes two pairs of cells homologous to the Mauthner cell and other populations of reticulospinal neurons. With this network fish initiate escapes only from appropriate stimuli, turn away from the offending stimulus, and do so very rapidly with a latency around 15 msec in goldfish. The Mauthner cells play an important role in these functions.