Neural Networks with Quadratic VC Dimension
Koiran, Pascal, Sontag, Eduardo D.
A set of labeled training samples is provided, and a network must be obtained which is then expected to correctly classify previously unseen inputs. In this context, a central problem is to estimate the amount of training data needed to guarantee satisfactory learning performance. To study this question, it is necessary to first formalize the notion of learning from examples. One such formalization is based on the paradigm of probably approximately correct (PAC) learning, due to Valiant (1984). In this framework, one starts by fitting some function f, chosen from a predetermined class F, to the given training data. The class F is often called the "hypothesis class", and for purposes of this discussion it will be assumed that the functions in F take binary values {0, 1} and are defined on a common domain X.
Learning with ensembles: How overfitting can be useful
Krogh, Anders (NORDITA, Blegdamsvej 17, 2100 Copenhagen, Denmark; krogh@sanger.ac.uk)
We study the characteristics of learning with ensembles. Solving exactly the simple model of an ensemble of linear students, we find surprisingly rich behaviour. For learning in large ensembles, it is advantageous to use under-regularized students, which actually over-fit the training data. Globally optimal performance can be obtained by choosing the training set sizes of the students appropriately. For smaller ensembles, optimization of the ensemble weights can yield significant improvements in ensemble generalization performance, in particular if the individual students are subject to noise in the training process. Choosing students with a wide range of regularization parameters makes this improvement robust against changes in the unknown level of noise in the training data. An ensemble is a collection of a (finite) number of neural networks or other types of predictors that are trained for the same task.
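The idea of an ensemble of linear students can be illustrated with a minimal sketch. It is not the exact model solved in the paper: the ridge penalty `lam`, the random half-size training subsets, the uniform ensemble weights, and all function names (`train_student`, `ensemble_predict`, `demo`) are assumptions made for illustration only.

```python
import numpy as np

def train_student(X, y, lam):
    """Ridge-regularized linear student: solves (X^T X + lam I) w = X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def ensemble_predict(students, X, weights=None):
    """Weighted average of the students' linear predictions."""
    preds = np.stack([X @ w for w in students])  # shape (K, n_samples)
    if weights is None:
        # Assumption: uniform ensemble weights unless others are supplied.
        weights = np.full(len(students), 1.0 / len(students))
    return weights @ preds

def demo(seed=0, d=10, n_train=40, n_test=500, n_students=5,
         lam=1e-3, noise=0.5):
    """Return (ensemble test error, mean individual student test error)."""
    rng = np.random.default_rng(seed)
    w_true = rng.normal(size=d)                   # hypothetical teacher
    X = rng.normal(size=(n_train, d))
    y = X @ w_true + noise * rng.normal(size=n_train)
    X_test = rng.normal(size=(n_test, d))
    y_test = X_test @ w_true                      # noise-free test targets

    students = []
    for _ in range(n_students):
        # Each student sees a random half of the training set.
        idx = rng.choice(n_train, size=n_train // 2, replace=False)
        students.append(train_student(X[idx], y[idx], lam))

    ens_err = np.mean((ensemble_predict(students, X_test) - y_test) ** 2)
    mean_err = np.mean([np.mean((X_test @ w - y_test) ** 2) for w in students])
    return ens_err, mean_err
```

With uniform weights and squared error, the ensemble error can never exceed the average error of its members, because the loss is convex. A small `lam` corresponds to weakly regularized students; comparing `demo(lam=...)` for different values is one way to explore the trade-off the abstract describes.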
Independent Component Analysis of Electroencephalographic Data
Makeig, Scott, Bell, Anthony J., Jung, Tzyy-Ping, Sejnowski, Terrence J.
Recent efforts to identify EEG sources have focused mostly on performing spatial segregation and localization of source activity [4]. By applying the ICA algorithm of Bell and Sejnowski [1], we attempt to completely separate the twin problems of source identification (What) and source localization (Where). The ICA algorithm derives independent sources from highly correlated EEG signals statistically and without regard to the physical location or configuration of the source generators. Rather than modeling the EEG as a unitary output of a multidimensional dynamical system, or as "the roar of the crowd" of independent microscopic generators, we suppose that the EEG is the output of a number of statistically independent but spatially fixed potential-generating systems which may either be spatially restricted or widely distributed.
A Predictive Switching Model of Cerebellar Movement Control
Barto, Andrew G., Houk, James C.
The existence of significant delays in sensorimotor feedback pathways has led several researchers to suggest that the cerebellum might function as a forward model of the motor plant in order to predict the sensory consequences of motor commands before actual feedback is available; e.g., (Ito, 1984; Keeler, 1990; Miall et al., 1993). While we agree that there are many potential roles for forward models in motor control systems, as discussed, e.g., in (Wolpert et al., 1995), we present a hypothesis about how the cerebellum could participate in regulating movement in the presence of significant feedback delays without resorting to a forward model. We show how a very simplified version of the adjustable pattern generator (APG) model being developed by Houk and colleagues (Berthier et al., 1993; Houk et al., 1995) can learn to control endpoint positioning of a nonlinear spring-mass system with significant delays in both afferent and efferent pathways. Although much simpler than a multilink dynamic arm, control of this spring-mass system involves some of the challenges critical in the control of a more realistic motor system and serves to illustrate the principles we propose. Preliminary results appear in (Buckingham et al., 1995).
Temporal coding in the sub-millisecond range: Model of barn owl auditory pathway
Kempter, Richard, Gerstner, Wulfram, Hemmen, J. Leo van, Wagner, Hermann
Binaural coincidence detection is essential for the localization of external sounds and requires auditory signal processing with high temporal precision. We present an integrate-and-fire model of spike processing in the auditory pathway of the barn owl. It is shown that a temporal precision in the microsecond range can be achieved with neuronal time constants which are at least one order of magnitude longer. An important feature of our model is an unsupervised Hebbian learning rule which leads to a temporal fine tuning of the neuronal connections.
The Geometry of Eye Rotations and Listing's Law
Handzel, Amir A., Flash, Tamar
Various parameterizations of rotations are related through a unifying mathematical treatment, and transformations between coordinate systems are computed using the Campbell-Baker-Hausdorff formula. Next, we describe Listing's law by means of the Lie algebra so(3). This enables us to demonstrate a direct connection to Donders' law, by showing that eye orientations are restricted to the quotient space SO(3)/SO(2). The latter is equivalent to the sphere S^2, which is exactly the space of gaze directions. Our analysis provides a mathematical framework for studying the oculomotor system and could also be extended to investigate the geometry of multi-joint arm movements.
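The correspondence between gaze directions on the sphere and eye orientations can be sketched numerically. The sketch below uses the common statement of Listing's law (each eye orientation is reached from the primary position by a rotation whose axis lies in the plane perpendicular to the primary gaze direction) and the standard Rodrigues formula for the exponential map from so(3) to SO(3). The choice of the x-axis as primary direction and the function names are assumptions, not taken from the paper.

```python
import numpy as np

PRIMARY = np.array([1.0, 0.0, 0.0])  # assumed primary gaze direction

def hat(w):
    """Map a 3-vector to the corresponding skew-symmetric matrix in so(3)."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def exp_so3(w):
    """Rodrigues formula: rotation matrix for the rotation vector w."""
    theta = np.linalg.norm(w)
    if theta < 1e-12:
        return np.eye(3)
    K = hat(w / theta)
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

def listing_rotation_vector(gaze, primary=PRIMARY):
    """Rotation vector in Listing's plane that carries `primary` to `gaze`.

    Undefined when `gaze` is exactly opposite to `primary`.
    """
    gaze = np.asarray(gaze, dtype=float)
    gaze = gaze / np.linalg.norm(gaze)
    axis = np.cross(primary, gaze)   # always perpendicular to `primary`
    s = np.linalg.norm(axis)         # sine of the rotation angle
    c = np.dot(primary, gaze)        # cosine of the rotation angle
    if s < 1e-12:
        if c < 0:
            raise ValueError("gaze opposite to primary: axis is not unique")
        return np.zeros(3)           # gaze equals primary: no rotation
    return axis / s * np.arctan2(s, c)
```

Each gaze direction thus determines a single rotation vector with zero component along the primary direction, which is one way to picture why the allowed orientations form a two-dimensional family indexed by the sphere of gaze directions.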
Information through a Spiking Neuron
Stevens, Charles F., Zador, Anthony M.
While it is generally agreed that neurons transmit information about their synaptic inputs through spike trains, the code by which this information is transmitted is not well understood. An upper bound on the information encoded is obtained by hypothesizing that the precise timing of each spike conveys information. Here we develop a general approach to quantifying the information carried by spike trains under this hypothesis, and apply it to the leaky integrate-and-fire (IF) model of neuronal dynamics. We formulate the problem in terms of the probability distribution p(T) of interspike intervals (ISIs), assuming that spikes are detected with arbitrary but finite temporal resolution. In the absence of added noise, all the variability in the ISIs could encode information, and the information rate is simply the entropy of the ISI distribution, H(T) = -Σ_T p(T) log2 p(T), times the spike rate.
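The noise-free bound described above (entropy of the discretized ISI distribution times the spike rate) can be estimated from a list of observed intervals. This is a minimal sketch, not the authors' procedure: the function name, the histogram-based estimate of p(T), and the use of 1 / mean ISI as the spike rate are assumptions.

```python
import numpy as np

def isi_information_rate(isis, resolution):
    """Estimate entropy(ISI) * spike rate in bits per second.

    isis       : interspike intervals in seconds
    resolution : temporal resolution (bin width) in seconds
    """
    isis = np.asarray(isis, dtype=float)
    # Discretize each ISI to the finite temporal resolution.
    bins = np.floor(isis / resolution).astype(int)
    counts = np.bincount(bins)
    p = counts[counts > 0] / counts.sum()     # empirical p(T), nonzero bins only
    entropy_bits = -np.sum(p * np.log2(p))    # bits per spike
    spike_rate = 1.0 / isis.mean()            # spikes per second
    return entropy_bits * spike_rate
```

For example, a train whose ISIs alternate with equal frequency between two distinguishable values carries at most 1 bit per spike, so the estimate reduces to the spike rate itself. A perfectly regular train gives zero.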
Dynamics of Attention as Near Saddle-Node Bifurcation Behavior
Nakahara, Hiroyuki, Doya, Kenji
Most studies of attention have focused on the selection process of incoming sensory cues (Posner et al., 1980; Koch et al., 1985; Desimone et al., 1995). Emphasis was placed on the phenomena of causing different percepts for the same sensory stimuli. However, the selection of sensory input itself is not the final goal of attention. We consider attention as a means for goal-directed behavior and survival of the animal. In this view, dynamical properties of attention are crucial. While attention has to be maintained long enough to enable robust response to sensory input, it also has to be shifted quickly to a novel cue that is potentially important. Long-term maintenance and quick transition are critical requirements for attention dynamics.
Harmony Networks Do Not Work
Harmony networks have been proposed as a means by which connectionist models can perform symbolic computation. Indeed, proponents claim that a harmony network can be built that constructs parse trees for strings in a context-free language. This paper shows that harmony networks do not work in the following sense: they construct many outputs that are not valid parse trees. In order to show that the notion of systematicity is compatible with connectionism, Paul Smolensky, Geraldine Legendre and Yoshiro Miyata (Smolensky, Legendre, and Miyata 1992; Smolensky 1993; Smolensky, Legendre, and Miyata 1994) proposed a mechanism, "Harmony Theory," by which connectionist models purportedly perform structure-sensitive operations without implementing classical algorithms. Harmony theory describes a "harmony network" which, in the course of reaching a stable equilibrium, apparently computes parse trees that are valid according to the rules of a particular context-free grammar.