Learning Sequential Tasks by Incrementally Adding Higher Orders

Neural Information Processing Systems

An incremental, higher-order, non-recurrent network combines two properties found to be useful for learning sequential tasks: higher-order connections and incremental introduction of new units. The network adds higher orders when needed by adding new units that dynamically modify connection weights. Since the new units modify the weights at the next time-step with information from the previous step, temporal tasks can be learned without the use of feedback, thereby greatly simplifying training. Furthermore, a theoretically unlimited number of units can be added to reach into the arbitrarily distant past.
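
A minimal sketch of the idea, assuming a simple NumPy formulation in which one higher-order gating unit per weight shifts that weight at the next time step using the previous input; the layer sizes, modulation tensor, and tanh output are illustrative choices, not the paper's exact architecture:

    import numpy as np

    rng = np.random.default_rng(0)
    n_in, n_out = 4, 3

    W = rng.normal(scale=0.1, size=(n_out, n_in))        # base first-order weights
    M = rng.normal(scale=0.1, size=(n_out, n_in, n_in))  # one gating unit per weight (illustrative)

    def step(x_prev, x_now):
        """Output at time t uses weights modulated by the input at time t-1."""
        W_eff = W + M @ x_prev          # higher-order units shift each weight
        return np.tanh(W_eff @ x_now)

    x_prev = np.zeros(n_in)
    for x_now in rng.normal(size=(5, n_in)):   # a short input sequence, no feedback loop
        y = step(x_prev, x_now)
        x_prev = x_now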


Hidden Markov Models in Molecular Biology: New Algorithms and Applications

Neural Information Processing Systems

Hidden Markov Models (HMMs) can be applied to several important problems in molecular biology. We introduce a new convergent learning algorithm for HMMs that, unlike the classical Baum-Welch algorithm, is smooth and can be applied online or in batch mode, with or without the usual Viterbi most likely path approximation. Left-right HMMs with insertion and deletion states are then trained to represent several protein families including immunoglobulins and kinases. In all cases, the models derived capture all the important statistical properties of the families and can be used efficiently in a number of important tasks such as multiple alignment, motif detection, and classification.
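
For background, a minimal sketch of the standard forward recursion for computing sequence likelihood under a small discrete HMM; the two-state model and two-symbol alphabet are invented, and this is generic machinery rather than the new learning algorithm or the profile architecture described above:

    import numpy as np

    A = np.array([[0.7, 0.3],      # state-transition probabilities
                  [0.4, 0.6]])
    B = np.array([[0.9, 0.1],      # emission probabilities over a 2-symbol alphabet
                  [0.2, 0.8]])
    pi = np.array([0.5, 0.5])      # initial state distribution

    def forward_likelihood(obs):
        """P(observation sequence | model) via the forward recursion."""
        alpha = pi * B[:, obs[0]]
        for o in obs[1:]:
            alpha = (alpha @ A) * B[:, o]
        return alpha.sum()

    print(forward_likelihood([0, 1, 1, 0]))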


Network Structuring and Training Using Rule-based Knowledge

Neural Information Processing Systems

We demonstrate in this paper how certain forms of rule-based knowledge can be used to prestructure a neural network of normalized basis functions and give a probabilistic interpretation of the network architecture. We describe several ways to ensure that rule-based knowledge is preserved during training and present a method for complexity reduction that tries to minimize the number of rules and the number of conjuncts. After training, the refined rules are extracted and analyzed.
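
A minimal sketch of the prestructuring idea, assuming each IF-THEN rule seeds one Gaussian basis function (rule premise as the center, rule conclusion as the output weight) and the network output is the normalized weighted sum; the rules, centers, and width below are invented for illustration and do not reproduce the paper's training or complexity-reduction steps:

    import numpy as np

    rules = [            # (premise center, conclusion value) -- invented example rules
        (np.array([0.0, 0.0]), 1.0),
        (np.array([1.0, 1.0]), -1.0),
    ]
    centers = np.stack([c for c, _ in rules])
    outputs = np.array([v for _, v in rules])
    width = 0.5

    def predict(x):
        act = np.exp(-np.sum((centers - x) ** 2, axis=1) / (2 * width ** 2))
        act /= act.sum()               # normalization gives the probabilistic reading
        return act @ outputs

    print(predict(np.array([0.2, 0.1])))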


Synchronization and Grammatical Inference in an Oscillating Elman Net

Neural Information Processing Systems

We have designed an architecture to span the gap between biophysics and cognitive science, to address how a discrete symbol-processing system can arise from the continuum and how complex dynamics like oscillation and synchronization can then be employed in its operation and affect its learning. We show how a discrete-time recurrent "Elman" network architecture can be constructed from recurrently connected oscillatory associative memory modules described by continuous nonlinear ordinary differential equations. The modules can learn connection weights between themselves which cause the system to evolve under a clocked "machine cycle" by a sequence of transitions of attractors within the modules, much as a digital computer evolves by transitions of its binary flip-flop attractors. The architecture thus employs the principle of "computing with attractors" used by macroscopic systems for reliable computation in the presence of noise. We have specifically constructed a system which functions as a finite-state automaton that recognizes or generates the infinite set of strings over six symbols defined by a Reber grammar. It is a symbol-processing system, but with analog input and oscillatory subsymbolic representations. The time steps (machine cycles) of the system are implemented by rhythmic variation (clocking) of a bifurcation parameter. This holds input and "context" modules clamped at their attractors while hidden and output modules change state, then clamps hidden and output states while context modules are released to load those states as the new context for the next cycle of input. Superior noise immunity has been demonstrated for systems with dynamic attractors over systems with static attractors, and synchronization ("binding") between coupled oscillatory attractors in different modules has been shown to be important for effecting reliable transitions.
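
For reference, a minimal symbolic sketch of a finite-state recognizer for one common rendering of the Reber grammar (begin/end markers B and E plus T, P, S, X, V); the paper's grammar variant may differ, and none of the oscillatory dynamics are modeled here:

    # State-transition table for one standard Reber grammar diagram (assumed, not from the paper)
    TRANSITIONS = {
        (0, "B"): 1,
        (1, "T"): 2, (1, "P"): 3,
        (2, "S"): 2, (2, "X"): 4,
        (3, "T"): 3, (3, "V"): 5,
        (4, "X"): 3, (4, "S"): 6,
        (5, "P"): 4, (5, "V"): 6,
        (6, "E"): 7,               # 7 = accepting state
    }

    def accepts(string):
        state = 0
        for sym in string:
            state = TRANSITIONS.get((state, sym))
            if state is None:
                return False
        return state == 7

    print(accepts("BTSSXXTVVE"), accepts("BTXXE"))   # True False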


Learning Fuzzy Rule-Based Neural Networks for Control

Neural Information Processing Systems

First, the membership functions and an initial rule representation are learned; second, the rules are compressed as much as possible using information theory; and finally, a computational network is constructed to compute the function value. This system is applied to two control examples: learning the truck-and-trailer backer-upper control system, and learning a cruise control system for a radio-controlled model car.

1 Introduction

Function approximation is the problem of estimating a function from a set of examples of its independent variables and function value. If there is prior knowledge of the type of function being learned, a mathematical model of the function can be constructed and the parameters perturbed until the best match is achieved. However, if there is no prior knowledge of the function, a model-free system such as a neural network or a fuzzy system may be employed to approximate an arbitrary nonlinear function. A neural network's inherent parallel computation is efficient for speed; however, the information learned is expressed only in the weights of the network. The advantage of fuzzy systems over neural networks is that the information learned is expressed in terms of linguistic rules. In this paper, we propose a method for learning a complete fuzzy system to approximate example data.
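
A minimal sketch of the final inference stage only, assuming Gaussian membership functions, product rule-firing strengths, and a normalized weighted average of crisp consequents; the membership parameters and rules are invented, and the learning and rule-compression steps described above are not shown:

    import numpy as np

    def mu(x, center, width):
        """Gaussian membership grade of x in a fuzzy set."""
        return np.exp(-((x - center) / width) ** 2)

    # Each rule: (membership params per input, crisp consequent value) -- invented examples
    rules = [
        (((0.0, 1.0), (0.0, 1.0)), 0.0),   # IF x1 NEAR 0 AND x2 NEAR 0 THEN y = 0
        (((2.0, 1.0), (2.0, 1.0)), 1.0),   # IF x1 NEAR 2 AND x2 NEAR 2 THEN y = 1
    ]

    def infer(x):
        strengths = [np.prod([mu(xi, c, w) for xi, (c, w) in zip(x, premise)])
                     for premise, _ in rules]
        consequents = [y for _, y in rules]
        return np.dot(strengths, consequents) / np.sum(strengths)

    print(infer([1.0, 1.0]))   # blends the two rules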


The Computation of Stereo Disparity for Transparent and for Opaque Surfaces

Neural Information Processing Systems

The classical computational model for stereo vision incorporates a uniqueness inhibition constraint to enforce a one-to-one feature match, thereby sacrificing the ability to handle transparency. Critics of the model disregard the uniqueness constraint and argue that the smoothness constraint can provide the excitation support required for transparency computation.
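
A minimal sketch of one cooperative update in 1D, in the spirit of the classical model under discussion: excitatory support from same-disparity neighbours (smoothness) and inhibition among competing disparities at the same image position (uniqueness); all parameter values are illustrative, not from the paper:

    import numpy as np

    n_pos, n_disp = 32, 5
    C = (np.random.default_rng(0).random((n_pos, n_disp)) > 0.7).astype(float)  # initial candidate matches

    def update(C, eps=2.0, theta=1.0):
        # smoothness: support from left/right neighbours at the same disparity
        excite = np.roll(C, 1, axis=0) + np.roll(C, -1, axis=0)
        # uniqueness: inhibition from competing disparities at the same position
        inhibit = C.sum(axis=1, keepdims=True) - C
        return (excite - eps * inhibit + C > theta).astype(float)

    for _ in range(5):
        C = update(C)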


Forecasting Demand for Electric Power

Neural Information Processing Systems

Our efforts proceed in the context of a problem suggested by the operational needs of a particular electric utility to make daily forecasts of short-term load or demand. Forecasts are made at midday (1 p.m.) on a weekday t (Monday through Thursday) for the next evening peak e(t) (occurring usually about 8 p.m. in the winter) and the daily minimum d(t).


Generic Analog Neural Computation - The EPSILON Chip

Neural Information Processing Systems

An analog CMOS VLSI neural processing chip has been designed and fabricated. The device employs "pulse-stream" neural state signalling, and is capable of computing some 360 million synaptic connections per second.


Adaptive Stimulus Representations: A Computational Theory of Hippocampal-Region Function

Neural Information Processing Systems

We present a theory of cortico-hippocampal interaction in discrimination learning. The hippocampal region is presumed to form new stimulus representations which facilitate learning by enhancing the discriminability of predictive stimuli and compressing stimulus-stimulus redundancies. The cortical and cerebellar regions are presumed to be the sites of long-term memory.


Hidden Markov Model Induction by Bayesian Model Merging

Neural Information Processing Systems

This paper describes a technique for learning both the number of states and the topology of Hidden Markov Models from examples. The induction process starts with the most specific model consistent with the training data and generalizes by successively merging states. Both the choice of states to merge and the stopping criterion are guided by the Bayesian posterior probability. We compare our algorithm with the Baum-Welch method of estimating fixed-size models, and find that it can induce minimal HMMs from data in cases where fixed-size estimation does not converge or requires redundant parameters to converge.

1 INTRODUCTION AND OVERVIEW

Hidden Markov Models (HMMs) are a well-studied approach to the modelling of sequence data. HMMs can be viewed as a stochastic generalization of finite-state automata, where both the transitions between states and the generation of output symbols are governed by probability distributions.
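
A minimal sketch of the merging loop in the spirit of this approach, assuming greedy best-first search, a Viterbi-path log likelihood computed directly from emission and transition counts, and a crude per-state size penalty in place of the paper's structural priors; the toy sequences and all scoring choices are illustrative:

    import math
    from collections import Counter

    def build_initial(sequences):
        """One state per symbol occurrence, chained in order (most specific model)."""
        emit, trans, s = {}, Counter(), 0
        for seq in sequences:
            prev = None
            for sym in seq:
                emit[s] = Counter({sym: 1})
                if prev is not None:
                    trans[(prev, s)] += 1
                prev, s = s, s + 1
        return emit, trans

    def log_score(emit, trans, size_penalty=1.0):
        """Viterbi-path log likelihood from counts, minus a crude size prior."""
        ll = 0.0
        for counts in emit.values():
            total = sum(counts.values())
            ll += sum(c * math.log(c / total) for c in counts.values())
        out = Counter()
        for (a, _), c in trans.items():
            out[a] += c
        ll += sum(c * math.log(c / out[a]) for (a, _), c in trans.items())
        return ll - size_penalty * len(emit)

    def merge(emit, trans, a, b):
        """Fold state b into state a, summing emission and transition counts."""
        new_emit = {s: Counter(c) for s, c in emit.items() if s != b}
        new_emit[a] = new_emit[a] + emit[b]
        new_trans = Counter()
        for (i, j), c in trans.items():
            new_trans[(a if i == b else i, a if j == b else j)] += c
        return new_emit, new_trans

    def induce(sequences):
        """Greedily merge states until no merge improves the score."""
        emit, trans = build_initial(sequences)
        best = log_score(emit, trans)
        while True:
            states, best_cand = list(emit), None
            for i, a in enumerate(states):
                for b in states[i + 1:]:
                    cand = merge(emit, trans, a, b)
                    sc = log_score(*cand)
                    if sc > best:
                        best, best_cand = sc, cand
            if best_cand is None:
                return emit, trans
            emit, trans = best_cand

    emit, trans = induce(["ab", "aab", "aaab"])
    print(len(emit), "states after merging")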