Bias, Variance and the Combination of Least Squares Estimators

Neural Information Processing Systems

We consider the effect of combining several least squares estimators on the expected performance of a regression problem. By computing the exact bias and variance curves as a function of the sample size, we are able to quantitatively compare the effect of the combination on the bias and variance separately, and thus on the expected error, which is the sum of the two. Our exact calculations demonstrate that the combination of estimators is particularly useful in the case where the data set is small and noisy and the function to be learned is unrealizable. For large data sets the single estimator produces superior results. Finally, we show that by splitting the data set into several independent parts and training each estimator on a different subset, the performance can in some cases be significantly improved.
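As a rough illustration of the decomposition this abstract relies on, the sketch below splits the expected squared error at a single input point into a squared-bias term and a variance term. The function name `bias_variance` and the sample numbers are assumptions made for illustration; they are not taken from the paper.

```python
from statistics import fmean


def bias_variance(predictions, target):
    """Decompose the mean squared error of a set of predictions.

    `predictions` holds the outputs of estimators trained on different
    data sets, all evaluated at the same input; `target` is the true value.
    Returns (squared bias, variance, expected squared error).
    """
    mean_pred = fmean(predictions)
    bias_sq = (mean_pred - target) ** 2
    variance = fmean([(p - mean_pred) ** 2 for p in predictions])
    expected_error = fmean([(p - target) ** 2 for p in predictions])
    return bias_sq, variance, expected_error


# Hypothetical predictions from three independently trained estimators.
b, v, e = bias_variance([1.0, 2.0, 3.0], target=1.0)
print(b, v, e)
```

For any set of predictions, the first two returned values add up to the third, which is the identity behind the phrase "the sum of the two".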


Learning from queries for maximum information gain in imperfectly learnable problems

Neural Information Processing Systems

In supervised learning, learning from queries rather than from random examples can improve generalization performance significantly. We study the performance of query learning for problems where the student cannot learn the teacher perfectly, which occur frequently in practice. As a prototypical scenario of this kind, we consider a linear perceptron student learning a binary perceptron teacher. Two kinds of queries for maximum information gain, i.e., minimum entropy, are investigated: Minimum student space entropy (MSSE) queries, which are appropriate if the teacher space is unknown, and minimum teacher space entropy (MTSE) queries, which can be used if the teacher space is assumed to be known, but a student of a simpler form has deliberately been chosen. We find that for MSSE queries, the structure of the student space determines the efficacy of query learning, whereas MTSE queries lead to a higher generalization error than random examples, due to a lack of feedback about the progress of the student in the way queries are selected.


Learning Stochastic Perceptrons Under k-Blocking Distributions

Neural Information Processing Systems

[...] when the probability distribution that generates the input examples is a member of a family that we call k-blocking distributions. Such distributions represent an important step beyond the case where each input variable is statistically independent since the 2k-blocking family contains all the Markov distributions of order k. By stochastic perceptron we mean a perceptron which, upon presentation of input vector x, outputs 1 with probability $f(\sum_i w_i x_i - \theta)$.
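The definition of a stochastic perceptron given above can be sketched in a few lines: the unit outputs 1 with a probability obtained by applying an activation function to the weighted input sum minus a threshold. The logistic activation, the function names, and the example weights below are assumptions chosen for illustration; the abstract does not specify them.

```python
import math
import random


def logistic(z):
    """Assumed activation function f; the abstract does not fix a specific f."""
    return 1.0 / (1.0 + math.exp(-z))


def fire_probability(weights, x, threshold, f=logistic):
    """Probability that the perceptron outputs 1 for input vector x."""
    net = sum(w * xi for w, xi in zip(weights, x)) - threshold
    return f(net)


def stochastic_output(weights, x, threshold, rng, f=logistic):
    """Sample a 0/1 output with the probability computed above."""
    return 1 if rng.random() < fire_probability(weights, x, threshold, f) else 0


rng = random.Random(0)  # seeded so that repeated runs draw the same samples
p = fire_probability([1, -1, 0], [1, 1, 1], threshold=0.0)
y = stochastic_output([1, -1, 0], [1, 1, 1], threshold=0.0, rng=rng)
```

In this example the weighted sum is zero, so the logistic activation gives a firing probability of one half.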


Stochastic Dynamics of Three-State Neural Networks

Neural Information Processing Systems

We present here an analysis of the stochastic neurodynamics of a neural network composed of three-state neurons described by a master equation. An outer-product representation of the master equation is employed. In this representation, an extension of the analysis from two to three-state neurons is easily performed. We apply this formalism with approximation schemes to a simple three-state network and compare the results with Monte Carlo simulations.


Neural Network Ensembles, Cross Validation, and Active Learning

Neural Information Processing Systems

It is well known that a combination of many different predictors can improve predictions. In the neural networks community "ensembles" of neural networks have been investigated by several authors, see for instance [1, 2, 3]. Most often the networks in the ensemble are trained individually and then their predictions are combined. This combination is usually done by majority (in classification) or by simple averaging (in regression), but one can also use a weighted combination of the networks.
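The three combination schemes mentioned in this abstract (majority vote, simple averaging, and a weighted combination) can be sketched as follows. The function names and sample inputs are hypothetical, and the weights are assumed to be supplied by the caller rather than learned.

```python
from collections import Counter


def majority_vote(labels):
    """Classification: return the label predicted by the most ensemble members."""
    return Counter(labels).most_common(1)[0][0]


def simple_average(outputs):
    """Regression: unweighted mean of the ensemble members' outputs."""
    return sum(outputs) / len(outputs)


def weighted_average(outputs, weights):
    """Regression: weighted combination; weights are normalised to sum to 1."""
    total = sum(weights)
    return sum(w * o for w, o in zip(weights, outputs)) / total


print(majority_vote(["a", "b", "a"]))
print(simple_average([1.0, 2.0, 3.0]))
print(weighted_average([1.0, 3.0], [0.25, 0.75]))
```

Simple averaging is the special case of the weighted combination in which every member receives the same weight.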


A Neural Model of Delusions and Hallucinations in Schizophrenia

Neural Information Processing Systems

We implement and study a computational model of Stevens' [1992] theory of the pathogenesis of schizophrenia. This theory hypothesizes that the onset of schizophrenia is associated with reactive synaptic regeneration occurring in brain regions receiving degenerating temporal lobe projections. Concentrating on one such area, the frontal cortex, we model a frontal module as an associative memory neural network whose input synapses represent incoming temporal projections. We analyze how, in the face of weakened external input projections, compensatory strengthening of internal synaptic connections and increased noise levels can maintain memory capacities (which are generally preserved in schizophrenia). However, these compensatory changes adversely lead to spontaneous, biased retrieval of stored memories, which accounts for the occurrence of schizophrenic delusions and hallucinations without any apparent external trigger, and for their tendency to concentrate on just a few central themes. Our results explain why these symptoms tend to wane as schizophrenia progresses, and why delayed therapeutic intervention leads to a much slower response.


Reinforcement Learning Predicts the Site of Plasticity for Auditory Remapping in the Barn Owl

Neural Information Processing Systems

In young barn owls raised with optical prisms over their eyes, these auditory maps are shifted to stay in register with the visual map, suggesting that the visual input imposes a frame of reference on the auditory maps. However, the optic tectum, the first site of convergence of visual with auditory information, is not the site of plasticity for the shift of the auditory maps; the plasticity occurs instead in the inferior colliculus, which contains an auditory map and projects into the optic tectum. We explored a model of the owl remapping in which the delivery of a global reinforcement signal is controlled by visual foveation. A Hebbian learning rule gated by reinforcement learned to appropriately adjust the auditory maps. In addition, reinforcement learning preferentially adjusted the weights in the inferior colliculus, as in the owl brain, even though the weights were allowed to change throughout the auditory system. This observation raises the possibility that the site of learning does not have to be genetically specified, but could be determined by how the learning procedure interacts with the network architecture.
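One common way to write a Hebbian rule gated by a global reinforcement signal is to multiply the usual pre/post activity product by the reward, so that weights change only when reinforcement is delivered. The sketch below assumes that generic form; the function name, the learning rate, and the exact update are illustrative assumptions and may differ from the rule used in the paper.

```python
def gated_hebb_update(weight, pre, post, reward, learning_rate=0.1):
    """Return the updated weight under a reward-gated Hebbian rule.

    Assumed form: delta_w = learning_rate * reward * pre * post.
    With reward == 0 the weight is left unchanged.
    """
    return weight + learning_rate * reward * pre * post


# Hypothetical values: co-active units with and without reinforcement.
w_no_reward = gated_hebb_update(0.5, pre=1.0, post=2.0, reward=0.0)
w_rewarded = gated_hebb_update(0.5, pre=1.0, post=2.0, reward=1.0)
```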


Anatomical origin and computational role of diversity in the response properties of cortical neurons

Neural Information Processing Systems

Our results show that maximal diversity of neuronal response properties is attained when the ratio of dendritic and axonal arbor sizes is equal to 1, a value found in many cortical areas and across species (Lund et al., 1993; Malach, 1994). Maximization of diversity also leads to better performance in systems of receptive fields implementing steerable/shiftable filters, which may be necessary for generating the seemingly continuous range of orientation selectivity found in V1, and in matching spatially distributed signals. This cortical organization principle may, therefore, have the double advantage of accounting for the formation of the cortical columns and the associated patchy projection patterns, and of explaining how systems of receptive fields can support functions such as the generation of precise response tuning from imprecise distributed inputs, and the matching of distributed signals, a problem that arises in visual tasks such as stereopsis, motion processing, and recognition.


A Novel Reinforcement Model of Birdsong Vocalization Learning

Neural Information Processing Systems

Songbirds learn to imitate a tutor song through auditory and motor learning. We have developed a theoretical framework for song learning that accounts for response properties of neurons that have been observed in many of the nuclei that are involved in song learning. Specifically, we suggest that the anterior forebrain pathway, which is not needed for song production in the adult but is essential for song acquisition, provides synaptic perturbations and adaptive evaluations for syllable vocalization learning. A computer model based on reinforcement learning was constructed that could replicate a real zebra finch song with 90% accuracy based on a spectrographic measure. The second generation of the birdsong model replicated the tutor song with 96% accuracy.