Goto

Collaborating Authors

 Genre


An experimental comparison of recurrent neural networks

Neural Information Processing Systems

Many different discrete-time recurrent neural network architectures havebeen proposed. However, there has been virtually no effort to compare these arch:tectures experimentally. In this paper we review and categorize many of these architectures and compare how they perform on various classes of simple problems including grammatical inference and nonlinear system identification.


A Connectionist Technique for Accelerated Textual Input: Letting a Network Do the Typing

Neural Information Processing Systems

Each year people spend a huge amount oftime typing. The text people type typically contains a tremendous amount of redundancy due to predictable word usage patterns and the text's structure. This paper describes a neural network system call AutoTypist that monitors a person's typing and predicts what will be entered next. AutoTypist displays the most likely subsequent word to the typist, who can accept it with a single keystroke, instead of typing it in its entirety. The multi-layer perceptron at the heart of Auto'JYpist adapts its predictions of likely subsequent text to the user's word usage pattern, and to the characteristics of the text currently being typed. Increases in typing speed of 2-3% when typing English prose and 10-20% when typing C code have been demonstrated using the system, suggesting a potential time savings of more than 20 hours per user per year. In addition to increasing typing speed, AutoTypist reduces the number of keystrokes a user must type by a similar amount (2-3% for English, 10-20% for computer programs). This keystroke savings has the potential to significantly reduce the frequency and severity of repeated stress injuries caused by typing, which are the most common injury suffered in today's office environment.


An Integrated Architecture of Adaptive Neural Network Control for Dynamic Systems

Neural Information Processing Systems

Most of the recent emphasis in the neural network control field has no error feedback as the control input, which rises the lack of adaptation problem. The integrated architecture in this paper combines feed forward control and error feedback adaptive control using neural networks. The paper reveals the different internal functionality of these two kinds of neural network controllers for certain input styles, e.g., state feedback and error feedback. With error feedback, neural network controllers learn the slopes or the gains with respect to the error feedback, producing an error driven adaptive control systems. The results demonstrate that the two kinds of control scheme can be combined to realize their individual advantages. Testing with disturbances added to the plant shows good tracking and adaptation with the integrated neural control architecture.


A Study of Parallel Perturbative Gradient Descent

Neural Information Processing Systems

Motivated by difficulties in analog VLSI implementation of back-propagation [Rumelhart et al., 1986] and related algorithms that calculate gradients based on detailed knowledge of the neural network model, there were several similar recent papersproposing to use a parallel [Alspector et al., 1993, Cauwenberghs, 1993, Kirk et al., 1993] or a semi-parallel [Flower and Jabri, 1993] perturbative technique which has the property that it measures (with the physical neural network) rather than calculates the gradient. This technique is closely related to methods of stochastic approximation[Kushner and Clark, 1978] which have been investigated recently by workers in fields other than neural networks.



Learning from queries for maximum information gain in imperfectly learnable problems

Neural Information Processing Systems

In supervised learning, learning from queries rather than from random examples can improve generalization performance significantly. Westudy the performance of query learning for problems where the student cannot learn the teacher perfectly, which occur frequently in practice. As a prototypical scenario of this kind, we consider a linear perceptron student learning a binary perceptron teacher. Two kinds of queries for maximum information gain, i.e., minimum entropy, are investigated: Minimum student space entropy (MSSE)queries, which are appropriate if the teacher space is unknown, and minimum teacher space entropy (MTSE) queries, which can be used if the teacher space is assumed to be known, but a student of a simpler form has deliberately been chosen. We find that for MSSE queries, the structure of the student space determines theefficacy of query learning, whereas MTSE queries lead to a higher generalization error than random examples, due to a lack of feedback about the progress of the student in the way queries are selected.


Catastrophic Interference in Human Motor Learning

Neural Information Processing Systems

Biological sensorimotor systems are not static maps that transform input (sensory information) into output (motor behavior). Evidence frommany lines of research suggests that their representations are plastic, experience-dependent entities. While this plasticity is essential for flexible behavior, it presents the nervous system with difficult organizational challenges. If the sensorimotor system adapts itself to perform well under one set of circumstances, will it then perform poorly when placed in an environment with different demands (negative transfer)? Will a later experience-dependent change undo the benefits of previous learning (catastrophic interference)?


On-line Learning of Dichotomies

Neural Information Processing Systems

The performance of online algorithms for learning dichotomies is studied. In online learning, thenumber of examples P is equivalent to the learning time, since each example is presented only once. The learning curve, or generalization error as a function of P, depends on the schedule at which the learning rate is lowered. For a target that is a perceptron rule, the learning curve of the perceptron algorithm can decrease as fast as p-1,if the schedule is optimized. If the target is not realizable by a perceptron, the perceptron algorithm does not generally converge to the solution with lowest generalization error.


Diffusion of Credit in Markovian Models

Neural Information Processing Systems

This paper studies the problem of diffusion in Markovian models, such as hidden Markov models (HMMs) and how it makes very difficult the task of learning of long-term dependencies in sequences. Using results from Markov chain theory, we show that the problem of diffusion is reduced if the transition probabilities approach 0 or 1. Under this condition, standard HMMs have very limited modeling capabilities, but input/output HMMs can still perform interesting computations.