Convergence and Pattern-Stabilization in the Boltzmann Machine
The Boltzmann Machine has been introduced as a means to perform global optimization for multimodal objective functions using the principles of simulated annealing. In this paper we consider its utility as a spurious-free content-addressable memory, and provide bounds on its performance in this context. We show how to exploit the machine's ability to escape local minima in order to use it, at a constant temperature, for unambiguous associative pattern retrieval in noisy environments. An association rule, which creates a sphere of influence around each stored pattern, is used along with the machine's dynamics to match the machine's noisy input with one of the pre-stored patterns. Spurious fixed points, whose regions of attraction are not recognized by the rule, are skipped, due to the machine's finite probability of escaping from any state. The results apply to the Boltzmann Machine and to the asynchronous net of binary threshold elements (Hopfield model). They provide the network designer with worst-case and best-case bounds on the network's performance, and allow polynomial-time tradeoff studies of design parameters.
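A minimal sketch of the retrieval scheme described above, assuming Hebbian outer-product storage, constant-temperature Glauber dynamics, and a Hamming-ball "sphere of influence" of an assumed radius around each stored pattern; the paper's actual association rule and performance bounds are not reproduced here.

```python
# Sketch only: assumed storage rule, dynamics, and radius; not the paper's exact scheme.
import numpy as np

rng = np.random.default_rng(0)

def store(patterns):
    """Hebbian outer-product weights for +/-1 patterns, zero diagonal (assumed storage rule)."""
    P = np.array(patterns, dtype=float)
    W = P.T @ P / P.shape[1]
    np.fill_diagonal(W, 0.0)
    return W

def retrieve(W, patterns, probe, T=0.5, radius=2, max_sweeps=200):
    """Run constant-temperature stochastic (Glauber) dynamics until the state
    falls inside the Hamming ball of some stored pattern (the association rule)."""
    s = probe.copy()
    n = len(s)
    for _ in range(max_sweeps):
        for i in rng.permutation(n):
            h = W[i] @ s                              # local field on unit i
            p_on = 1.0 / (1.0 + np.exp(-2.0 * h / T))
            s[i] = 1 if rng.random() < p_on else -1
        for mu, xi in enumerate(patterns):
            if np.sum(s != xi) <= radius:             # inside the sphere of influence
                return mu, s
    return None, s                                    # no unambiguous match found

patterns = [p for p in rng.choice([-1, 1], size=(3, 32))]
W = store(patterns)
probe = patterns[0].copy()
probe[:5] *= -1                                       # corrupt a few bits to simulate noise
match, state = retrieve(W, patterns, probe)
print("retrieved pattern index:", match)
```

Because every transition has nonzero probability at T > 0, the dynamics can wander out of a spurious fixed point whose neighbourhood is never recognized by the rule; this is the escape property the abstract's bounds rely on.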
Mapping Classifier Systems Into Neural Networks
Classifier systems are machine learning systems incorporating a genetic algorithm as the learning mechanism. Although they respond to inputs that neural networks can respond to, their internal structure, representation formalisms, and learning mechanisms differ markedly from those employed by neural network researchers in the same sorts of domains. As a result, one might conclude that these two types of machine learning formalisms are intrinsically different. This is one of two papers that, taken together, prove instead that classifier systems and neural networks are equivalent. In this paper, half of the equivalence is demonstrated through the description of a transformation procedure that will map classifier systems into neural networks that are isomorphic in behavior. Several alterations to the commonly used paradigms employed by neural network researchers are required in order to make the transformation work.
Fast Learning in Multi-Resolution Hierarchies
A variety of approaches to adaptive information processing have been developed by workers in disparate disciplines. These include the large body of literature on approximation and interpolation techniques (curve and surface fitting), the linear, real-time adaptive signal processing systems (such as the adaptive linear combiner and the Kalman filter), and most recently, the reincarnation of nonlinear neural network models such as the multilayer perceptron. Each of these methods has its strengths and weaknesses. The curve and surface fitting techniques are excellent for off-line data analysis, but are typically not formulated with real-time applications in mind. The linear techniques of adaptive signal processing and adaptive control are well-characterized, but are limited to applications for which linear descriptions are appropriate. Finally, neural network learning models such as back propagation have proven extremely versatile at learning a wide variety of nonlinear mappings, but tend to be very slow computationally and are not yet well characterized.
Consonant Recognition by Modular Construction of Large Phonemic Time-Delay Neural Networks
Encouraged by these results, we wanted to explore how we might expand on these models to make them useful for the design of speech recognition systems. A problem that emerges as we attempt to apply neural network models to the full speech recognition problem is the problem of scaling. Simply extending neural networks to ever larger structures and retraining them as one monolithic net quickly exceeds the capabilities of the fastest and largest supercomputers. The search complexity of finding good solutions in a huge space of possible network configurations also soon assumes unmanageable proportions. Moreover, having to decide on all possible classes for recognition ahead of time, as well as collecting sufficient data to train such a large monolithic network, is impractical to say the least. In an effort to extend our models from small recognition tasks to large-scale speech recognition systems, we must therefore explore modularity and incremental learning as design strategies to break up a large learning task into smaller subtasks. Breaking up a large task into subtasks to be tackled by individual black boxes interconnected in ad hoc arrangements, on the other hand, would mean abandoning one of the most attractive aspects of connectionism: the ability to perform complex constraint satisfaction in a massively parallel and interconnected fashion, in view of an overall optimal performance goal.
Does the Neuron "Learn" like the Synapse?
An improved learning paradigm that offers a significant reduction in computation time during the supervised learning phase is described. It is based on extending the role that the neuron plays in artificial neural systems. Prior work has regarded the neuron as a strictly passive, nonlinear processing element, and the synapse, on the other hand, as the primary source of information processing and knowledge retention. In this work, the role of the neuron is extended by allowing its parameters to participate adaptively in the learning phase. The temperature of the sigmoid function is an example of such a parameter.
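A minimal sketch of letting a neuron parameter learn alongside the weights, assuming a single sigmoid unit with y = σ(net/T), squared error, and plain gradient descent on both the weights and the temperature T; the paper's actual update rules and reported speed-up are not reproduced here.

```python
# Sketch only: assumed loss, learning rates, and data; illustrates a trainable temperature.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(net, T):
    return 1.0 / (1.0 + np.exp(-net / T))

# Toy linearly separable data: label 1 if x0 + x1 > 1.
X = rng.random((200, 2))
t = (X.sum(axis=1) > 1.0).astype(float)

w = rng.normal(scale=0.1, size=2)
b = 0.0
T = 1.0                                   # temperature, treated as a learned parameter
lr_w, lr_T = 0.5, 0.1

for epoch in range(200):
    net = X @ w + b
    y = sigmoid(net, T)
    err = y - t                           # dE/dy for squared error (up to a constant)
    dnet = err * y * (1 - y) / T          # dE/dnet, since dy/dnet = y(1-y)/T
    w -= lr_w * X.T @ dnet / len(X)
    b -= lr_w * dnet.mean()
    dT = (err * y * (1 - y) * (-net / T**2)).mean()   # dy/dT = y(1-y)(-net/T^2)
    T -= lr_T * dT
    T = max(T, 1e-2)                      # keep the temperature positive

print("final temperature:", round(T, 3),
      "accuracy:", ((y > 0.5) == t.astype(bool)).mean())
```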
Modeling Small Oscillating Biological Networks in Analog VLSI
Ryckebusch, Sylvie, Bower, James M., Mead, Carver
We have used analog VLSI technology to model a class of small oscillating biological neural circuits known as central pattern generators (CPG). These circuits generate rhythmic patterns of activity which drive locomotor behaviour in the animal. We have designed, fabricated, and tested a model neuron circuit which relies on many of the same mechanisms as a biological central pattern generator neuron, such as delays and internal feedback. We show that this neuron can be used to build several small circuits based on known biological CPG circuits, and that these circuits produce patterns of output which are very similar to the observed biological patterns. To date, researchers in applied neural networks have tended to focus on mammalian systems as the primary source of potentially useful biological information.
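A minimal software sketch of the kind of circuit described above, assuming a two-unit half-center oscillator in which mutual inhibition plus a slow adaptation variable stand in for the delays and internal feedback mentioned in the abstract; the parameters are illustrative and are not taken from the fabricated VLSI circuit.

```python
# Sketch only: two mutually inhibitory units with slow adaptation produce
# antiphase oscillations, the basic half-center CPG motif.
import numpy as np

def f(x):
    return np.clip(x, 0.0, 1.0)              # rectified, saturating activation

dt, t_end = 0.001, 4.0
tau_v, tau_a = 0.05, 0.5                      # fast activity, slow adaptation
I, w_inh, g_adapt = 1.0, 2.0, 2.0             # tonic drive, mutual inhibition, adaptation gain

v = np.array([0.6, 0.1])                      # unit activities
a = np.zeros(2)                               # adaptation ("internal feedback") variables
trace = []

for _ in range(int(t_end / dt)):
    r = f(v)
    drive = I - w_inh * r[::-1] - g_adapt * a # each unit is inhibited by the other
    v += dt * (-v + f(drive)) / tau_v
    a += dt * (-a + r) / tau_a
    trace.append(r.copy())

trace = np.array(trace)
# Crude check that the two units alternate rather than co-activate.
print("unit 0 active fraction:", (trace[:, 0] > 0.5).mean())
print("unit 1 active fraction:", (trace[:, 1] > 0.5).mean())
```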
A Massively Parallel Self-Tuning Context-Free Parser
The Parsing and Learning System (PALS) is a massively parallel self-tuning context-free parser. It is capable of parsing sentences of unbounded length, mainly due to its parse-tree representation scheme. The system is capable of improving its parsing performance through the presentation of training examples. Recent PDP research [Rumelhart et al., 1986; Feldman and Ballard, 1982; Lippmann, 1987] involving natural language processing [Fanty, 1988; Selman, 1985; Waltz and Pollack, 1985] has unrealistically restricted sentences to a fixed length. A solution to this problem was presented in the system CONPARSE [Charniak and Santos].
Linear Learning: Landscapes and Algorithms
In particular we examine what happens when the number of layers is large or when the connectivity between layers is local, and we investigate some of the properties of an autoassociative algorithm. Notation will be as in [1], where additional motivations and references can be found. It is usual to criticize linear networks because "linear functions do not compute" and because several layers can always be reduced to one by the proper multiplication of matrices. However, this is not the point of view adopted here. It is assumed that the architecture of the network is given (and could perhaps depend on external constraints), and the purpose is to understand what happens during the learning phase, what strategies are adopted by a synaptic weight modifying algorithm, ... [see also Cottrell et al. (1988) for an example of an application and the work of Linsker (1988) on the emergence of feature-detecting units in linear networks].
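A minimal sketch of the observation quoted above: stacking linear layers yields exactly the same input-output map as a single matrix, so the interest lies in the learning dynamics and landscape of the layered parameterization rather than in added expressive power. The layer sizes below are illustrative only.

```python
# Sketch only: composition of linear layers collapses to one matrix product.
import numpy as np

rng = np.random.default_rng(0)

n_in, n_hidden, n_out = 6, 3, 6
W1 = rng.normal(size=(n_hidden, n_in))    # first layer
W2 = rng.normal(size=(n_out, n_hidden))   # second layer

x = rng.normal(size=(n_in, 10))           # a batch of inputs as columns

y_layered = W2 @ (W1 @ x)                 # forward pass through both layers
y_single = (W2 @ W1) @ x                  # the equivalent single-layer map

print("identical maps:", np.allclose(y_layered, y_single))
```

With the hidden layer narrower than the input, as here, the layered product is the bottleneck configuration relevant to the autoassociative case mentioned above.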