Hyperparameters, Evidence and Generalisation for an Unrealisable Rule
Using a statistical mechanical formalism we calculate the evidence, generalisation error and consistency measure for a linear perceptron trained and tested on a set of examples generated by a nonlinear teacher. The teacher is said to be unrealisable because the student can never model it without error. Our model allows us to interpolate between the known case of a linear teacher and an unrealisable, nonlinear teacher. A comparison of the hyperparameters which maximise the evidence with those that optimise the performance measures reveals that, in the nonlinear case, the evidence procedure is a misleading guide to optimising performance. Finally, we explore the extent to which the evidence procedure is unreliable and find that, despite being sub-optimal, in some circumstances it might be a useful method for fixing the hyperparameters. 1 INTRODUCTION The analysis of supervised learning or learning from examples is a major field of research within neural networks.
Higher Order Statistical Decorrelation without Information Loss
Deco, Gustavo, Brauer, Wilfried
A neural network learning paradigm based on information theory is proposed as a way to perform, in an unsupervised fashion, redundancy reduction among the elements of the output layer without loss of information from the sensory input. The model developed performs nonlinear decorrelation up to higher orders of the cumulant tensors and results in probabilistically independent components of the output layer. This means that we need not assume a Gaussian distribution at either the input or the output. The theory presented is related to the unsupervised-learning theory of Barlow, which proposes redundancy reduction as the goal of cognition. When nonlinear units are used, nonlinear principal component analysis is obtained.
Limits on Learning Machine Accuracy Imposed by Data Quality
Cortes, Corinna, Jackel, L. D., Chiang, Wan-Ping
Random errors and insufficiencies in databases limit the performance of any classifier trained from and applied to the database. In this paper we propose a method to estimate the limiting performance of classifiers imposed by the database. We demonstrate this technique on the task of predicting failure in telecommunication paths. 1 Introduction Data collection for a classification or regression task is prone to random errors, e.g.
Neural Network Ensembles, Cross Validation, and Active Learning
Krogh, Anders, Vedelsby, Jesper
It is well known that a combination of many different predictors can improve predictions. In the neural networks community "ensembles" of neural networks have been investigated by several authors, see for instance [1, 2, 3]. Most often the networks in the ensemble are trained individually and then their predictions are combined. This combination is usually done by majority (in classification) or by simple averaging (in regression), but one can also use a weighted combination of the networks.
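The combination rules named in this abstract can be illustrated with a minimal sketch. The function names `average` and `majority_vote` are hypothetical and chosen here for illustration only; this is not the authors' implementation, and the tie-breaking rule is an assumption.

```python
from collections import Counter


def average(predictions, weights=None):
    """Combine regression outputs by a weighted mean.

    With weights=None every network counts equally (simple averaging).
    """
    if weights is None:
        weights = [1.0] * len(predictions)
    total = sum(weights)
    return sum(w * p for w, p in zip(weights, predictions)) / total


def majority_vote(labels):
    """Return the most frequent class label.

    Assumption: ties go to the label that appears first in `labels`.
    """
    return Counter(labels).most_common(1)[0][0]
```

Passing unequal `weights` to `average` gives the weighted combination mentioned in the last sentence of the abstract; how those weights should be chosen is not stated in this excerpt.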
From Data Distributions to Regularization in Invariant Learning
For unbiased models the regularizer reduces to the intuitive form that penalizes the mean squared difference between the network output for transformed and untransformed inputs - i.e. the error in satisfying the desired invariance. In general the regularizer includes a term that measures correlations between the error in fitting the data, and the error in satisfying the desired invariance. For infinitesimal transformations, the regularizer is equivalent (up to terms linear in the variance of the transformation parameters) to the tangent prop form given by Simard et al.
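The "intuitive form" for unbiased models can be sketched as follows. This is a rough illustration under stated assumptions, not the paper's derivation: `model` and `transform` are hypothetical callables on scalar inputs, and a single fixed transformation stands in for the average over transformation parameters.

```python
def invariance_penalty(model, inputs, transform):
    """Mean squared difference between the model output on transformed
    and untransformed inputs, i.e. the error in satisfying the invariance.

    Assumption: `model` and `transform` map a scalar to a scalar.
    """
    diffs = [(model(transform(x)) - model(x)) ** 2 for x in inputs]
    return sum(diffs) / len(diffs)
```

A model that is exactly invariant under `transform` gives a penalty of zero, for example `invariance_penalty(lambda x: x * x, [1.0, 2.0], lambda x: -x)`.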
Grouping Components of Three-Dimensional Moving Objects in Area MST of Visual Cortex
Zemel, Richard S., Sejnowski, Terrence J.
A number of studies have described neurons in the dorsal part of the medial superior temporal (MSTd) monkey cortex that respond best to large expanding/contracting, rotating, or shifting patterns (Tanaka et al., 1986; Duffy & Wurtz, 1991a). Recently Graziano et al. (1994) found that MSTd cell responses correspond to a point in a multidimensional space of spiral motions, where the dimensions are these motion types. Combinations of these motions are generated as an animal moves through its environment, which suggests that area MSTd could play a role in optical flow analysis. When an observer moves through a static environment, a singularity in the flow field known as the focus of expansion may be used to determine the direction of heading (Gibson, 1950; Warren & Hannon, 1988). Previous computational models of MSTd (Lappe & Rauschecker, 1993; Perrone & Stone, 1994) have shown how navigational information related to heading may be encoded by these cells.
A Neural Model of Delusions and Hallucinations in Schizophrenia
Ruppin, Eytan, Reggia, James A., Horn, David
We implement and study a computational model of Stevens' [1992] theory of the pathogenesis of schizophrenia. This theory hypothesizes that the onset of schizophrenia is associated with reactive synaptic regeneration occurring in brain regions receiving degenerating temporal lobe projections. Concentrating on one such area, the frontal cortex, we model a frontal module as an associative memory neural network whose input synapses represent incoming temporal projections. We analyze how, in the face of weakened external input projections, compensatory strengthening of internal synaptic connections and increased noise levels can maintain memory capacities (which are generally preserved in schizophrenia). However, these compensatory changes adversely lead to spontaneous, biased retrieval of stored memories, which corresponds to the occurrence of schizophrenic delusions and hallucinations without any apparent external trigger, and accounts for their tendency to concentrate on just a few central themes. Our results explain why these symptoms tend to wane as schizophrenia progresses, and why delayed therapeutic intervention leads to a much slower response.