On the Non-Existence of a Universal Learning Algorithm for Recurrent Neural Networks
We prove that the so-called "loading problem" for (recurrent) neural networks is unsolvable. This extends several results which have already demonstrated that training and related design problems for neural networks are (at least) NP-complete. Our result also implies that it is impossible to formulate a universal training algorithm which, for any neural network architecture, could determine a correct set of weights. The proof is simple: we show that the loading problem is equivalent to "Hilbert's tenth problem", which is known to be unsolvable.
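As context for the reduction, a hedged restatement of the standard result being reduced from (the statement below is classical; the mapping to architectures is the paper's construction and is only summarized here):

    % Hilbert's tenth problem: given an arbitrary polynomial p with
    % integer coefficients, decide whether
    \exists\, x_1, \dots, x_n \in \mathbb{N} \quad \text{such that} \quad p(x_1, \dots, x_n) = 0.
    % Matiyasevich (1970): no algorithm can decide this. Hence any map
    % p -> A_p with "architecture A_p is loadable (some weight setting
    % fits the training data) iff the equation above is solvable"
    % transfers this unsolvability to the loading problem.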
Complexity Issues in Neural Computation and Learning
Roychowdhury, V. P., Siu, K.-Y.
The general goal of this workshop was to bring together researchers working toward developing a theoretical framework for the analysis and design of neural networks. The technical focus of the workshop was to address recent developments in this direction. The primary topics fell into three areas: 1) computational complexity issues in neural networks, 2) complexity issues in learning, and 3) convergence and numerical properties of learning algorithms. Such studies have, in turn, generated considerable research interest. A similar development can be observed in the area of learning as well: techniques primarily developed in the classical theory of learning are being applied to understand the generalization and learning characteristics of neural networks.
Optimal Unsupervised Motor Learning Predicts the Internal Representation of Barn Owl Head Movements
This implies the existence of a set of orthogonal internal coordinates that are related to meaningful coordinates of the external world. No coherent computational theory has yet been proposed to explain this finding. I have proposed a simple model which provides a framework for a theory of low-level motor learning. I show that the theory predicts the observed microstimulation results in the barn owl. The model rests on the concept of "Optimal Unsupervised Motor Learning", which provides a set of criteria that predict optimal internal representations. I describe two iterative neural network algorithms which find the optimal solution and demonstrate possible mechanisms for the development of internal representations in animals.
1 INTRODUCTION
In the sensory domain, many algorithms for unsupervised learning have been proposed. These algorithms learn depending on statistical properties of the input data, and often can be used to find useful "intermediate" sensory representations.
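As one concrete illustration of the class of unsupervised algorithms alluded to here, a sketch of Sanger's generalized Hebbian rule, which extracts mutually orthogonal components of the input statistics; this stands in for, and is not, the paper's motor-learning algorithm:

    import numpy as np

    def sanger_update(W, x, lr=0.01):
        # Rows of W converge to the top-k principal components of the
        # input distribution, i.e. a set of orthogonal internal coordinates.
        y = W @ x
        # The lower-triangular term decorrelates successive outputs.
        return W + lr * (np.outer(y, x) - np.tril(np.outer(y, y)) @ W)

    rng = np.random.default_rng(0)
    W = rng.normal(scale=0.1, size=(2, 5))
    C = np.diag([5.0, 2.0, 1.0, 0.5, 0.1])  # anisotropic input covariance
    for _ in range(5000):
        W = sanger_update(W, rng.multivariate_normal(np.zeros(5), C))
    print(np.round(W @ W.T, 2))  # approximately the identity: orthonormal rows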
Robot Learning: Exploration and Continuous Domains
David A. Cohn, MIT Dept. of Brain and Cognitive Sciences, Cambridge, MA 02139. The goal of this workshop was to discuss two major issues: efficient exploration of a learner's state space, and learning in continuous domains. The common themes that emerged in presentations and in discussion were the importance of choosing one's domain assumptions carefully, mixing controllers/strategies, avoidance of catastrophic failure, new approaches to the difficulties of reinforcement learning, and the importance of task transfer. Moore suggested that neither "fewer assumptions are better" nor "more assumptions are better" is a tenable position, and that we should strive to find and use standard sets of assumptions; with no such commonality, comparison of techniques and results is meaningless. Under Moore's guidance, the group discussed the possibility of designing an algorithm which used a number of well-chosen assumption sets and switched between them according to their empirical validity, as sketched below.
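A minimal sketch of that switching idea, assuming each strategy exposes hypothetical predict/act methods; none of this code is from the workshop itself:

    import numpy as np

    class AssumptionSwitcher:
        # Maintain several strategies, each built on a different set of
        # domain assumptions; route control to the one whose predictions
        # currently match observations best (its "empirical validity").
        def __init__(self, strategies):
            self.strategies = strategies  # each has .predict(s) and .act(s)
            self.errors = np.zeros(len(strategies))

        def act(self, state):
            best = int(np.argmin(self.errors))  # most empirically valid so far
            return self.strategies[best].act(state)

        def observe(self, state, outcome, decay=0.99):
            # Running, exponentially discounted prediction error per strategy.
            for i, s in enumerate(self.strategies):
                err = np.sum((s.predict(state) - outcome) ** 2)
                self.errors[i] = decay * self.errors[i] + (1 - decay) * err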
What Does the Hippocampus Compute?: A Precis of the 1993 NIPS Workshop
Computational models of the hippocampal region provide an important method for understanding the functional role of this brain system in learning and memory. The presentations in this workshop focused on how modeling can lead to a unified understanding of the interplay among hippocampal physiology, anatomy, and behavior. One approach can be characterized as "top-down" analyses of the neuropsychology of memory, drawing upon brain-lesion studies in animals and humans. Other models take a "bottom-up" approach, seeking to infer emergent computational and functional properties from detailed analyses of circuit connectivity and physiology (see Gluck & Granger, 1993, for a review). Among the issues discussed were: (1) integration of physiological and behavioral theories of hippocampal function, (2) similarities and differences between animal and human studies, (3) representational vs. temporal properties of hippocampal-dependent behaviors, (4) rapid vs. incremental learning, (5) multiple vs. unitary memory systems, (6) spatial navigation and memory, and (7) hippocampal interaction with other brain systems.
Bayesian Backprop in Action: Pruning, Committees, Error Bars and an Application to Spectroscopy
MacKay's Bayesian framework for backpropagation is conceptually appealing as well as practical. It automatically adjusts the weight decay parameters during training, and computes the evidence for each trained network. The evidence is proportional to our belief in the model. In this paper, the framework is extended to pruned nets, leading to an Ockham Factor for "tuning the architecture to the data". A committee of networks, selected by their high evidence, is a natural Bayesian construction.
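A minimal sketch of how such a committee might be assembled, using the standard Gaussian (Laplace) approximation to the log evidence; the hyperparameters alpha (weight-decay strength) and beta (inverse noise level), and the network callables, are illustrative assumptions rather than the paper's exact computation:

    import numpy as np

    def log_evidence(err_data, err_weights, hessian, alpha, beta, n_data):
        # Laplace approximation to the log evidence for one trained net.
        k = hessian.shape[0]                       # number of weights
        _, logdet = np.linalg.slogdet(hessian)     # log det of error Hessian
        return (-beta * err_data - alpha * err_weights
                - 0.5 * logdet
                + 0.5 * k * np.log(alpha)
                + 0.5 * n_data * np.log(beta))

    def committee_predict(nets, evidences, x, top=5):
        # Average the predictions of the highest-evidence networks.
        order = np.argsort(evidences)[::-1][:top]
        return np.mean([nets[i](x) for i in order], axis=0)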
Using Local Trajectory Optimizers to Speed Up Global Optimization in Dynamic Programming
Dynamic programming provides a methodology to plan trajectories and design controllers and estimators for nonlinear systems. However, general dynamic programming is computationally intractable. We have developed procedures that allow more complex planning problems to be solved. We have modified the State Increment Dynamic Programming approach of Larson (1968) in several ways: 1. In State Increment DP, a constant action is integrated to form a trajectory segment from the center of a cell to its boundary. We use second order local trajectory optimization (Differential Dynamic Programming) to generate an optimal trajectory and form an optimal policy in a tube surrounding the optimal trajectory within a cell. The trajectory segment and local policy are globally optimal, up to the resolution of the representation of the value function on the boundary of the cell.
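For reference, the globally optimal but intractable baseline being accelerated is ordinary tabular value iteration over a discretized state space; the generic sketch below is not the paper's State Increment DP / DDP hybrid:

    import numpy as np

    def value_iteration(P, R, gamma=0.95, tol=1e-6):
        # P: (A, S, S) transition probabilities, R: (A, S) rewards.
        # Cost grows with the number of discretized states S, which is
        # why a fine grid over a continuous state space is intractable.
        V = np.zeros(P.shape[1])
        while True:
            Q = R + gamma * P @ V          # (A, S) action values
            V_new = Q.max(axis=0)
            if np.max(np.abs(V_new - V)) < tol:
                return V_new, Q.argmax(axis=0)   # value and greedy policy
            V = V_new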
Adaptive Knot Placement for Nonparametric Regression
Najafi, Hossein L., Cherkassky, Vladimir
We show how an "Elman" network architecture, constructed from recurrently connected oscillatory associative memory network modules, can employ selective "attentional" control of synchronization to direct the flow of communication and computation within the architecture to solve a grammatical inference problem. Previously we have shown how the discrete-time "Elman" network algorithm can be implemented in a network completely described by continuous ordinary differential equations. The time steps (machine cycles) of the system are implemented by rhythmic variation (clocking) of a bifurcation parameter. In this architecture, oscillation amplitude codes the information content or activity of a module (unit), whereas phase and frequency are used to "softwire" the network. Only synchronized modules communicate by exchanging amplitude information; the activity of non-resonating modules contributes incoherent crosstalk noise. Attentional control is modeled as a special subset of the hidden modules with outputs which affect the resonant frequencies of other hidden modules. They control synchrony among the other modules and direct the flow of computation (attention) to effect transitions between two subgraphs of a thirteen-state automaton which the system emulates to generate a Reber grammar. The internal crosstalk noise is used to drive the required random transitions of the automaton.
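A loose sketch of synchronization-gated communication using generic Kuramoto-style phase coupling; the coupling matrix K and the amplitude dynamics are illustrative assumptions, not the paper's ODEs:

    import numpy as np

    def step(phase, amp, freq, K, dt=0.01):
        # Phase pull: modules coupled through K drift into synchrony.
        dphase = freq + (K * np.sin(phase[None, :] - phase[:, None])).sum(axis=1)
        # Coherence ~ 1 for synchronized pairs, so amplitude is exchanged
        # mainly between resonating modules; detuned pairs average out as
        # incoherent crosstalk.
        coherence = np.cos(phase[None, :] - phase[:, None])
        damp = (K * coherence * amp[None, :]).sum(axis=1) - amp
        return phase + dt * dphase, amp + dt * damp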
Mixtures of Controllers for Jump Linear and Non-Linear Plants
Cacciatore, Timothy W., Nowlan, Steven J.
To control such complex systems, it is computationally more efficient to decompose the problem into smaller subtasks, with different control strategies for different operating points. When detailed information about the plant is available, gain scheduling has proven a successful method for designing a global control (Shamma and Athans, 1992). The system is partitioned by choosing several operating points and a linear model for each operating point. A controller is designed for each linear model, and a method for interpolating or 'scheduling' the gains of the controllers is chosen. The control problem becomes even more challenging when the system to be controlled is non-stationary and the mode of the system is not explicitly observable.
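A minimal sketch of the interpolation step of classical gain scheduling, with Gaussian weights as an illustrative choice; the scheduling variable x, the operating points, and the per-point gains are assumptions, not the paper's mixture-of-controllers method:

    import numpy as np

    def scheduled_gain(x, operating_points, gains, width=1.0):
        # gains: (m, n_u, n_x) -- one state-feedback gain per operating point.
        w = np.exp(-((x - operating_points) ** 2) / (2 * width ** 2))
        w = w / w.sum()                        # smooth interpolation weights
        return np.tensordot(w, gains, axes=1)  # blended gain matrix (n_u, n_x)

    # usage: u = -scheduled_gain(x_sched, ops, Ks) @ state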