Propagation Filters in PDS Networks for Sequencing and Ambiguity Resolution
Sumida, Ronald A., Dyer, Michael G.
We present a Parallel Distributed Semantic (PDS) Network architecture that addresses the problems of sequencing and ambiguity resolution in natural language understanding. A PDS Network stores phrases and their meanings using multiple PDP networks, structured in the form of a semantic net. A mechanism called Propagation Filters is employed: (1) to control communication between networks, (2) to properly sequence the components of a phrase, and (3) to resolve ambiguities. Simulation results indicate that PDS Networks and Propagation Filters can successfully represent high-level knowledge, can be trained relatively quickly, and provide for parallel inferencing at the knowledge level.
1 INTRODUCTION
Backpropagation has shown considerable potential for addressing problems in natural language processing (NLP). However, the traditional PDP [Rumelhart and McClelland, 1986] approach of using one (or a small number) of backprop networks for NLP has been plagued by a number of problems: (1) it has been largely unsuccessful at representing high-level knowledge, (2) the networks are slow to train, and (3) they are sequential at the knowledge level. A solution to these problems is to represent high-level knowledge structures over a large number of smaller PDP networks.
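As a rough illustration of the gating role that Propagation Filters play between sub-networks, the sketch below shows one way such a filter could be expressed in code. The class names, similarity test, and threshold are illustrative assumptions, not the mechanism from the paper.

# Minimal sketch of the gating idea behind Propagation Filters (hypothetical names;
# the paper's actual architecture is not reproduced here). A "filter" inspects the
# activation pattern on a source PDP network and decides whether, and to which
# destination network in the semantic net, that pattern is propagated.
import numpy as np

class PropagationFilter:
    def __init__(self, selector, destination):
        self.selector = selector          # pattern the filter is sensitive to (assumed)
        self.destination = destination    # target sub-network that would receive it

    def passes(self, pattern, threshold=0.8):
        # Gate opens only if the source pattern is close enough to the selector.
        norm = np.linalg.norm(pattern) * np.linalg.norm(self.selector) + 1e-9
        return (pattern @ self.selector) / norm >= threshold

def propagate(source_pattern, filters):
    # Route the source activation only to destinations whose filters fire.
    return {f.destination: source_pattern for f in filters if f.passes(source_pattern)}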
Computer Recognition of Wave Location in Graphical Data by a Neural Network
Five experiments were performed using several neural network architectures to identify the location of a wave in the time-ordered graphical results from a medical test. Baseline results from the first experiment found correct identification of the target wave in 85% of cases (n = 20). Other experiments investigated the effect of different architectures and preprocessing the raw data on the results. The methods used seem most appropriate for time-oriented graphical data which has a clear starting point, such as electrophoresis or spectrometry, rather than continuous tests such as ECGs and EEGs.
1 INTRODUCTION
Complex wave form recognition is generally considered to be a difficult task for machines. Analytical approaches to this problem have been described and they work with reasonable accuracy (Gabriel et al. 1980).
Refining PID Controllers using Neural Networks
Scott, Gary M., Shavlik, Jude W., Ray, W. Harmon
We apply this method to the task of controlling the outflow and temperature of a water tank, producing statistically significant gains in accuracy over both a standard neural network approach and a non-learning PID controller. Furthermore, using the PID knowledge to initialize the weights of the network produces statistically less variation in test-set accuracy when compared to networks initialized with small random numbers.
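The idea of seeding a network with PID knowledge can be pictured as a linear unit whose inputs are the error, its running integral, and its difference, with weights initialized to the PID gains so that before any training the unit reproduces the PID law. The gains, feature layout, and task below are placeholder assumptions for illustration, not the paper's controller.

# Hedged sketch of PID-based weight initialization: the network starts out computing
# exactly the PID control law, and gradient descent on plant data can refine it.
# The gains and the water-tank task itself are placeholders, not values from the paper.
import numpy as np

Kp, Ki, Kd = 2.0, 0.5, 0.1          # hypothetical PID gains
w = np.array([Kp, Ki, Kd])          # network weights initialized from PID knowledge

def pid_features(error, error_integral, error_prev, dt=1.0):
    # Error, accumulated error, and error difference form the input vector.
    return np.array([error, error_integral, (error - error_prev) / dt])

def control(features, weights):
    # With the seeded weights this is exactly the PID law; training adjusts it from there.
    return float(weights @ features)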
Segmentation Circuits Using Constrained Optimization
Analog hardware has obvious advantages in terms of its size, speed, cost, and power consumption. Analog chip designers, however, should not feel constrained to mapping existing digital algorithms to silicon. Many times, new algorithms must be adapted or invented to ensure efficient implementation in analog hardware. Novel analog algorithms embedded in the hardware must be simple and obey the natural constraints of physics. Much algorithm intuition can be gained from experimenting with these continuous-time nonlinear systems. For example, the algorithm described in this paper arose from experimentation with existing analog segmentation hardware. Surprisingly, many of these "analog" algorithms may prove useful even if a computer vision researcher is limited to simulating the analog hardware on a digital computer [7].
A Network of Localized Linear Discriminants
The localized linear discriminant network (LLDN) has been designed to address classification problems containing relatively closely spaced data from different classes (encounter zones [1], the accuracy problem [2]). Locally trained hyperplane segments are an effective way to define the decision boundaries for these regions [3]. The LLD uses a modified perceptron training algorithm for effective discovery of separating hyperplane/sigmoid units within narrow boundaries. The basic unit of the network is the discriminant receptive field (DRF), which combines the LLD function with Gaussians representing the dispersion of the local training data with respect to the hyperplane. The DRF implements a local distance measure [4], and obtains the benefits of networks of localized units [5]. A constructive algorithm for the two-class case is described which incorporates DRFs into the hidden layer to solve local discrimination problems. The output unit produces a smoothed, piecewise linear decision boundary. Preliminary results indicate the ability of the LLDN to efficiently achieve separation when boundaries are narrow and complex, in cases where both the "standard" multilayer perceptron (MLP) and k-nearest neighbor (KNN) yield high error rates on training data.
1 The LLD Training Algorithm and DRF Generation
The LLD is defined by the hyperplane normal vector V and its "midpoint" M (a translated origin [1] near the center of gravity of the training data in feature space). Incremental corrections to V and M accrue for each training token feature vector Yj in the training set, as illustrated in figure 1 (exaggerated magnitudes).
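One plausible reading of the incremental corrections to V and M is a perceptron-style update expressed relative to the translated origin, as in the hedged sketch below; the specific correction rule and learning rates are assumptions for illustration, not the equations from the paper.

# Hedged sketch of an LLD-style update: the hyperplane is kept as a normal vector V
# and a translated origin ("midpoint") M, and each training token contributes an
# incremental correction to both. The correction rule and rates are assumed.
import numpy as np

def lld_update(V, M, Y, label, eta_v=0.1, eta_m=0.01):
    """One incremental correction from token Y with label in {-1, +1}."""
    z = V @ (Y - M)                     # signed side of the hyperplane for this token
    if label * z <= 0:                  # token on the wrong side: correct the normal
        V = V + eta_v * label * (Y - M)
        V = V / (np.linalg.norm(V) + 1e-12)
    M = M + eta_m * (Y - M)             # keep the midpoint near the local center of gravity
    return V, M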
Forward Dynamics Modeling of Speech Motor Control Using Physiological Data
Hirayama, Makoto, Vatikiotis-Bateson, Eric, Kawato, Mitsuo, Jordan, Michael I.
We propose a paradigm for modeling speech production based on neural networks. We focus on characteristics of the musculoskeletal system. Using real physiological data - articulator movements and EMG from muscle activity - a neural network learns the forward dynamics relating motor commands to muscles and the ensuing articulator behavior. After learning, simulated perturbations were used to assess properties of the acquired model, such as natural frequency, damping, and interarticulator couplings. Finally, a cascade neural network is used to generate continuous motor commands from a sequence of discrete articulatory targets.
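A forward dynamics model of this kind can be pictured as a network regressing from the current articulator state and EMG-derived motor commands to the next articulator state; the toy two-layer network below is only a sketch under that assumption, with made-up dimensions and training details.

# Minimal sketch of a forward-dynamics learner in the spirit described above.
# Network shape, state variables, and data layout are assumptions, not the paper's model.
import numpy as np

rng = np.random.default_rng(0)
n_state, n_emg, n_hidden = 4, 6, 16
W1 = rng.normal(scale=0.1, size=(n_hidden, n_state + n_emg))
W2 = rng.normal(scale=0.1, size=(n_state, n_hidden))

def forward(state, emg):
    h = np.tanh(W1 @ np.concatenate([state, emg]))
    return W2 @ h, h

def train_step(state, emg, next_state, lr=0.01):
    # One supervised regression step on a recorded (state, EMG) -> next-state sample.
    global W1, W2
    pred, h = forward(state, emg)
    err = pred - next_state
    W2 -= lr * np.outer(err, h)
    dh = (W2.T @ err) * (1 - h ** 2)              # backprop through the tanh hidden layer
    W1 -= lr * np.outer(dh, np.concatenate([state, emg]))
    return float(np.mean(err ** 2))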
Gradient Descent: Second Order Momentum and Saturating Error
We then regard gradient descent with momentum as a dynamic system and explore a non-quadratic error surface, showing that saturation of the error accounts for a variety of effects observed in simulations and justifies some popular heuristics.
1 INTRODUCTION
Gradient descent is the bread-and-butter optimization technique in neural networks. Some people build special-purpose hardware to accelerate gradient descent optimization of backpropagation networks. Understanding the dynamics of gradient descent on such surfaces is therefore of great practical value. Here we briefly review the known results on the convergence of batch gradient descent; show that second-order momentum does not give any speedup; simulate a real network and observe some effects not predicted by theory; and account for these effects by analyzing gradient descent with momentum on a saturating error surface.
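Viewed as a dynamic system, batch gradient descent with momentum updates a pair (weights, velocity) at each step; the sketch below shows that iteration on a toy quadratic bowl. The learning rate, momentum value, and loss function are placeholders for illustration, not the paper's experiments.

# Batch gradient descent with momentum as an iteration on (w, v):
#   v_{t+1} = mu * v_t - lr * grad E(w_t),   w_{t+1} = w_t + v_{t+1}
import numpy as np

def momentum_descent(grad, w0, lr=0.05, mu=0.9, steps=200):
    w = np.asarray(w0, dtype=float).copy()
    v = np.zeros_like(w)
    for _ in range(steps):
        v = mu * v - lr * grad(w)
        w = w + v
    return w

# Example: a quadratic bowl with gradient H w and eigenvalues 1 and 10 (placeholder loss).
H = np.diag([1.0, 10.0])
w_final = momentum_descent(lambda w: H @ w, w0=[1.0, 1.0])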