Statistical Mechanics of Learning in a Large Committee Machine

Neural Information Processing Systems

We use statistical mechanics to study generalization in large committee machines. For an architecture with nonoverlapping receptive fields, a replica calculation yields the generalization error in the limit of a large number of hidden units.


The Power of Approximating: a Comparison of Activation Functions

Neural Information Processing Systems

We compare activation functions in terms of the approximation power of their feedforward nets, considering the case of analog as well as boolean input.


Deriving Receptive Fields Using an Optimal Encoding Criterion

Neural Information Processing Systems

In unsupervised network learning, the development of the connection weights is influenced by statistical properties of the ensemble of input vectors, rather than by the degree of mismatch between the network's output and some 'desired' output. An implicit goal of such learning is that the network should transform the input so that salient features present in the input are represented at the output in a more useful form. This is often done by reducing the input dimensionality in a way that preserves the high-variance components of the input (e.g., principal component analysis, Kohonen feature maps). The principle of maximum information preservation ('infomax') is an unsupervised learning strategy that states (Linsker 1988): from a set of allowed input-output mappings (e.g., parametrized by the connection weights), choose a mapping that maximizes the (ensemble-averaged) Shannon information that the output vector conveys about the input vector, in the presence of noise.
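In the linear-Gaussian setting (which Linsker also analyzed), the infomax objective has a closed form: for an output y = Wx + n with Gaussian input covariance C and isotropic Gaussian noise of variance sigma2, the mutual information is half the log-determinant ratio of output to noise covariance. The following is a minimal sketch of that quantity; the function and variable names are illustrative, not from the paper.

```python
import numpy as np

def gaussian_infomax_objective(W, C, sigma2):
    """Shannon information (in nats) that y = Wx + n conveys about x,
    assuming Gaussian x with covariance C and isotropic Gaussian noise
    of variance sigma2: I = 0.5 * log det(W C W^T + sigma2 I) / det(sigma2 I)."""
    k = W.shape[0]
    out_cov = W @ C @ W.T + sigma2 * np.eye(k)
    _, logdet_out = np.linalg.slogdet(out_cov)
    _, logdet_noise = np.linalg.slogdet(sigma2 * np.eye(k))
    return 0.5 * (logdet_out - logdet_noise)
```

Maximizing this objective over W (subject to a constraint such as bounded weight norm, since otherwise amplifying W trivially increases it) is one concrete instance of the infomax principle.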


Neural Network Model Selection Using Asymptotic Jackknife Estimator and Cross-Validation Method

Neural Information Processing Systems

Two theorems and a lemma are presented about the use of the jackknife estimator and the cross-validation method for model selection. Theorem 1 gives the asymptotic form of the jackknife estimator. Combined with the model selection criterion, this asymptotic form can be used to obtain the fit of a model. The model selection criterion we use is the negative of the average predictive likelihood, a choice based on the idea of the cross-validation method. Lemma 1 provides a formula for further exploration of the asymptotics of the model selection criterion. Theorem 2 gives an asymptotic form of the model selection criterion for the regression case, when the parameter optimization criterion has a penalty term. Theorem 2 also proves the asymptotic equivalence of Moody's model selection criterion (Moody, 1992) and the cross-validation method, when the distance measure between the response y and the regression function takes the form of a squared difference. Selecting a model for a specified problem is the key to generalization based on the training data set.
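The cross-validation idea behind the criterion can be sketched concretely. Below is a leave-one-out cross-validation score with a squared-error criterion, mirroring the squared-difference case of Theorem 2; the polynomial-regression setting and data are illustrative, not the paper's.

```python
import numpy as np

def loo_cv_score(x, y, degree):
    """Leave-one-out cross-validation score for polynomial regression:
    refit with each point held out, average the squared prediction error."""
    n = len(x)
    errs = []
    for i in range(n):
        mask = np.arange(n) != i
        coeffs = np.polyfit(x[mask], y[mask], degree)
        errs.append((y[i] - np.polyval(coeffs, x[i])) ** 2)
    return float(np.mean(errs))

rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 30)
y = x ** 2 + 0.01 * rng.standard_normal(30)  # quadratic ground truth
# degree 1 underfits the quadratic, so its LOO score is much larger than degree 2's
```

Choosing the model (here, the degree) that minimizes this score is the cross-validation method; the paper's contribution is an asymptotic form of such criteria that avoids the repeated refitting.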


Learning Control Under Extreme Uncertainty

Neural Information Processing Systems

A peg-in-hole insertion task is used as an example to illustrate the utility of direct associative reinforcement learning methods for learning control under real-world conditions of uncertainty and noise. Task complexity due to the use of an unchamfered hole and a clearance of less than 0.2mm is compounded by the presence of positional uncertainty of magnitude exceeding 10 to 50 times the clearance. Despite this extreme degree of uncertainty, our results indicate that direct reinforcement learning can be used to learn a robust reactive control strategy that results in skillful peg-in-hole insertions.


Memory-Based Reinforcement Learning: Efficient Computation with Prioritized Sweeping

Neural Information Processing Systems

We present a new algorithm, Prioritized Sweeping, for efficient prediction and control of stochastic Markov systems. Incremental learning methods such as Temporal Differencing and Q-learning have fast real-time performance. Classical methods are slower, but more accurate, because they make full use of the observations. Prioritized Sweeping aims for the best of both worlds. It uses all previous experiences both to prioritize important dynamic-programming sweeps and to guide the exploration of state space.
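The core idea, ordering value backups by how much they are expected to change, can be sketched on a tiny deterministic chain MDP. This is a minimal illustration of prioritized backups, not the authors' full stochastic-model algorithm; the chain, rewards, and parameters are illustrative.

```python
import heapq

def prioritized_sweeping(n_states=5, gamma=0.9, theta=1e-6):
    """Value prediction on a deterministic chain 0 -> 1 -> ... -> n-1,
    with reward 1 for entering the terminal state n-1.  States are
    backed up in priority order; a state's predecessor is re-queued
    whenever its value changes by more than theta."""
    V = [0.0] * n_states

    def backup(s):
        if s == n_states - 1:
            return 0.0  # terminal state
        r = 1.0 if s + 1 == n_states - 1 else 0.0
        return r + gamma * V[s + 1]

    # seed the queue with the state adjacent to the goal (max-priority
    # via negated keys, since heapq is a min-heap)
    pq = [(-1.0, n_states - 2)]
    while pq:
        _, s = heapq.heappop(pq)
        new_v = backup(s)
        delta = abs(new_v - V[s])
        V[s] = new_v
        if delta > theta and s > 0:
            heapq.heappush(pq, (-delta, s - 1))  # predecessor inherits priority
    return V

V = prioritized_sweeping()
```

On this chain the sweep propagates the reward backward in a single pass, touching each state once, which is the efficiency the priority queue buys in larger stochastic systems.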


Visual Motion Computation in Analog VLSI Using Pulses

Neural Information Processing Systems

The real-time computation of motion from real images using a single chip with integrated sensors is a hard problem. We present two analog VLSI schemes that use pulse-domain neuromorphic circuits to compute motion. Pulses of variable width, rather than graded potentials, represent a natural medium for evaluating temporal relationships.


Filter Selection Model for Generating Visual Motion Signals

Neural Information Processing Systems

We present a model of how MT cells aggregate responses from V1 to form such a velocity representation. Two different sets of units, with local receptive fields, receive inputs from motion energy filters. One set of units forms estimates of local motion, while the second set computes the utility of these estimates. Outputs from this second set of units "gate" the outputs from the first set through a gain control mechanism. This active process for selecting only a subset of local motion responses to integrate into more global responses distinguishes our model from previous models of velocity estimation.
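The gating computation amounts to a utility-weighted combination of local estimates. Here is a minimal sketch of that gain-control step, assuming scalar velocity estimates and utility signals in [0, 1]; the function and values are illustrative, not the model's actual units.

```python
import numpy as np

def gated_velocity(local_estimates, utilities, eps=1e-12):
    """Combine local motion estimates v_i weighted by utility gates
    g_i in [0, 1], so low-utility (unreliable) local responses are
    suppressed before global integration."""
    g = np.asarray(utilities, dtype=float)
    v = np.asarray(local_estimates, dtype=float)
    return float((g * v).sum() / (g.sum() + eps))

# two reliable estimates near 2.0 and one unreliable outlier at 10.0:
# the gates keep the combined estimate close to the reliable pair
est = gated_velocity([2.0, 2.1, 10.0], [0.9, 0.8, 0.05])
```

An ungated average of the same three estimates would be pulled to 4.7 by the outlier; the gain control keeps the global estimate near the trustworthy local responses.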


Time Warping Invariant Neural Networks

Neural Information Processing Systems

We propose a model, Time Warping Invariant Neural Networks (TWINN), to handle time-warped continuous signals. Although TWINN is a simple modification of the well-known recurrent neural network, analysis has shown that TWINN completely removes time warping and is able to handle difficult classification problems. It is also shown that TWINN has certain advantages over currently available sequential processing schemes: dynamic programming (DP) [1], hidden Markov models (HMM) [2], time-delay neural networks (TDNN) [3], and neural network finite automata (NNFA) [4]. We also analyzed the time continuity employed in TWINN and pointed out that this kind of structure can memorize a longer input history than NNFA can. This may help explain the well-accepted fact that for learning grammatical inference with NNFA one has to start with very short strings in the training set. The numerical example we used is a trajectory classification problem. This problem, which features variable sampling rates, internal states, continuous dynamics, heavily time-warped data, and deformed phase-space trajectories, is shown to be difficult for other schemes. With TWINN this problem was learned in 100 iterations. As a benchmark we also trained TDNN on the exact same problem and, as expected, it completely failed.
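One way to picture time-warp invariance in a recurrent update is to scale each state increment by the input increment ||x_t - x_{t-1}||, so the state trajectory depends on the path traced by the input rather than on how it is sampled in time. The sketch below illustrates that principle only; it is an assumption-laden illustration, not the paper's exact TWINN equations, and all names and parameters are made up.

```python
import numpy as np

def path_dependent_state(xs, W, U):
    """Recurrent state whose update is scaled by the input increment:
    repeated (stalled) samples contribute zero increment, so the final
    state is invariant to this kind of time warping of the input."""
    h = np.zeros(W.shape[0])
    for t in range(1, len(xs)):
        dl = np.linalg.norm(xs[t] - xs[t - 1])  # arc-length increment
        h = h + dl * np.tanh(W @ h + U @ xs[t])
    return h

rng = np.random.default_rng(0)
W = 0.1 * rng.standard_normal((3, 3))
U = 0.1 * rng.standard_normal((3, 2))
ts = np.linspace(0.0, 3.0, 40)
xs = np.stack([np.cos(ts), np.sin(ts)], axis=1)   # an arc in input space
xs_warped = np.repeat(xs, 3, axis=0)              # same path, warped sampling

h_once = path_dependent_state(xs, W, U)
h_warped = path_dependent_state(xs_warped, W, U)  # identical final state
```

Because the repeated samples have zero increment, the warped sequence produces exactly the same sequence of nonzero updates, which is the invariance the abstract describes.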


Holographic Recurrent Networks

Neural Information Processing Systems

Holographic Recurrent Networks (HRNs) are recurrent networks which incorporate associative memory techniques for storing sequential structure. HRNs can be easily and quickly trained using gradient descent techniques to generate sequences of discrete outputs and trajectories through continuous space. The performance of HRNs is found to be superior to that of ordinary recurrent networks on these sequence generation tasks. The representation and processing of data with complex structure in neural networks remains a challenge. In a previous paper [Plate, 1991b] I described Holographic Reduced Representations (HRRs), which use circular-convolution associative memory to embody sequential and recursive structure in fixed-width distributed representations. This paper introduces Holographic Recurrent Networks (HRNs), which are recurrent nets that incorporate these techniques for generating sequences of symbols or trajectories through continuous space.
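The circular-convolution associative memory at the heart of HRRs can be sketched directly: binding is circular convolution (a pointwise product in the Fourier domain), and approximate unbinding is circular correlation. This is a sketch of the binding operation only, not Plate's full HRN architecture; the vector size and seed are arbitrary.

```python
import numpy as np

def cconv(a, b):
    """Circular convolution: binds two vectors into one of the same width."""
    return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))

def ccorr(a, b):
    """Circular correlation: approximate inverse, cueing the trace with a."""
    return np.real(np.fft.ifft(np.conj(np.fft.fft(a)) * np.fft.fft(b)))

n = 512
rng = np.random.default_rng(1)
# elements drawn N(0, 1/n) so vectors have roughly unit Euclidean length
key, value = rng.normal(0.0, 1.0 / np.sqrt(n), (2, n))

trace = cconv(key, value)        # fixed-width bound pair
retrieved = ccorr(key, trace)    # noisy reconstruction of value
similarity = float(retrieved @ value
                   / (np.linalg.norm(retrieved) * np.linalg.norm(value)))
```

The reconstruction is noisy but far closer to the stored value than to chance, which is what lets a clean-up memory recover the exact item; crucially, the bound trace has the same width as its constituents, so structure can be nested without growing the representation.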