Goto

Collaborating Authors

 Technology


Time-Warping Network: A Hybrid Framework for Speech Recognition

Neural Information Processing Systems

Such systems attempt to combine the best features of both models: the temporal structure of HMMs and the discriminative power of neural networks. In this work we define a time-warping (1W) neuron that extends the operation of the fonnal neuron of a back-propagation network by warping the input pattern to match it optimally to its weights. We show that a single-layer network of TW neurons is equivalent to a Gaussian density HMMbased recognitionsystem.



Connectionist Optimisation of Tied Mixture Hidden Markov Models

Neural Information Processing Systems

Horacio Franco Michael Cohen SRI International Menlo Park CA 94025 USA Issues relating to the estimation of hidden Markov model (HMM) local probabilities are discussed. In particular we note the isomorphism of radial basisfunctions (RBF) networks to tied mixture density modellingj additionally we highlight the differences between these methods arising from the different training criteria employed. We present a method in which connectionist training can be modified to resolve these differences and discuss some preliminary experiments. Finally, we discuss some outstanding problemswith discriminative training.


Rule Induction through Integrated Symbolic and Subsymbolic Processing

Neural Information Processing Systems

We describe a neural network, called RufeNet, that learns explicit, symbolic condition-action rules in a formal string manipulation domain. of the domain,RuleNet discovers functional categories over elements and, at various points during learning, extracts rules that operate on these categories. The rules are then injected back into RuleNet and in a process called iterative projection. By incorporatingtraining continues, rules in this way, RuleNet exhibits enhanced learning and generalization performance over alternative neural net approaches. By integrating symbolic rule learning and subsymbolic category learning, RuleNet has capabilities that go beyond a purely symbolic system. We show how this architecture can be applied to the problem of case-role assignment in natural language processing, yielding a novel rule-based solution.


Recognition of Manipulated Objects by Motor Learning

Neural Information Processing Systems

We present two neural network controller learning schemes based on feedbackerror-learning and modular architecture for recognition and control of multiple manipulated objects. In the first scheme, a Gating Network is trained to acquire of a number of objects (or sets ofobject-specific representations for recognition objects). In the second scheme, an Estimation Network is trained to acquire function-specific, rather than object-specific, representations which directly estimate to identify manipulatedphysical parameters. Both recognition networks are trained objects using somatic and/or visual information. After learning, appropriate motor commands for manipulation of each object are issued by the control networks.


Hierarchical Transformation of Space in the Visual System

Neural Information Processing Systems

Neurons encoding simple visual features in area VI such as orientation, direction of motion and color are organized in retinotopic maps. However, recentphysiological experiments have shown that the responses of many neurons in VI and other cortical areas are modulated by the direction ofgaze. We have developed a neural network model of the visual cortex to explore the hypothesis that visual features are encoded in headcentered coordinatesat early stages of visual processing. New experiments are suggested for testing this hypothesis using electrical stimulations and psychophysical observations.


A Network of Localized Linear Discriminants

Neural Information Processing Systems

The localized linear discriminant network (LLDN) has been designed to address classification problems containing relatively closely spaced data from different classes (encounter zones [1], the accuracy problem [2]). Locally trained hyperplane segmentsare an effective way to define the decision boundaries for these regions [3]. The LLD uses a modified perceptron training algorithm for effective discovery of separating hyperplane/sigmoid units within narrow boundaries. The basic unit of the network is the discriminant receptive field (DRF) which combines the LLD function with Gaussians representing the dispersion of the local training data with respect to the hyperplane. The DRF implements a local distance measure [4],and obtains the benefits of networks oflocalized units [5]. A constructive algorithm for the two-class case is described which incorporates DRF's into the hidden layer to solve local discrimination problems. The output unit produces a smoothed, piecewise linear decision boundary. Preliminary results indicate the ability of the LLDN to efficiently achieve separation when boundaries are narrow and complex, in cases where both the "standard" multilayer perceptron (MLP) and k-nearest neighbor (KNN) yield high error rates on training data. 1 The LLD Training Algorithm and DRF Generation The LLD is defined by the hyperplane normal vector V and its "midpoint" M (a translated origin [1] near the center of gravity of the training data in feature space). Incremental corrections to V and M accrue for each training token feature vector Yj in the training set, as iIlustrated in figure 1 (exaggerated magnitudes).


Forward Dynamics Modeling of Speech Motor Control Using Physiological Data

Neural Information Processing Systems

We propose a paradigm for modeling speech production based on neural networks. We focus on characteristics of the musculoskeletal system. Using real physiological data - articulator movements and EMG from muscle activitya neuralnetwork learns the forward dynamics relating motor commands to muscles and the ensuing articulator behavior. After learning, simulated perturbations, were used to asses properties of the acquired model, such as natural frequency, damping, and interarticulator couplings. Finally, a cascade neural network is used to generate continuous motor commands from a sequence of discrete articulatory targets.



Gradient Descent: Second Order Momentum and Saturating Error

Neural Information Processing Systems

We then regard gradient descent with momentum as a dynamic system and explore a non quadratic error surface, showing that saturation of the error accounts for a variety of effects observed in simulations and justifies some popular heuristics. 1 INTRODUCTION Gradient descent is the bread-and-butter optimization technique in neural networks. Some people build special purpose hardware to accelerate gradient descent optimization ofbackpropagation networks. Understanding the dynamics of gradient descent on such surfaces is therefore of great practical value. Here we briefly review the known results in the convergence of batch gradient descent; showthat second-order momentum does not give any speedup; simulate a real network and observe some effect not predicted by theory; and account for these effects by analyzing gradient descent with momentum on a saturating error surface.