Diffusion of Credit in Markovian Models
Bengio, Yoshua, Frasconi, Paolo
This paper studies the problem of diffusion in Markovian models, such as hidden Markov models (HMMs), and how it makes learning long-term dependencies in sequences very difficult. Using results from Markov chain theory, we show that the problem of diffusion is reduced if the transition probabilities approach 0 or 1. Under this condition, standard HMMs have very limited modeling capabilities, but input/output HMMs can still perform interesting computations.
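As a rough illustration of the diffusion effect described above, the following sketch raises two hypothetical 2-state transition matrices to increasing powers and measures how far apart their rows remain. The matrices `soft` and `sharp`, and the `row_spread` measure, are assumptions chosen for illustration and are not taken from the paper. When the rows of the t-step matrix become nearly identical, the state distribution at time t no longer depends on the starting state, so information about the distant past has diffused away.

```python
def mat_mul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def mat_power(a, t):
    """Return the t-step transition matrix a**t (t >= 0)."""
    n = len(a)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(t):
        result = mat_mul(result, a)
    return result


def row_spread(m):
    """Largest gap between any two rows within a single column.

    A value near 0 means every starting state leads to almost the same
    distribution, i.e. the influence of the initial state is lost.
    """
    return max(max(col) - min(col) for col in zip(*m))


# Hypothetical transition matrices (each row sums to 1).
soft = [[0.6, 0.4], [0.3, 0.7]]        # probabilities far from 0 and 1
sharp = [[0.99, 0.01], [0.01, 0.99]]   # probabilities close to 0 or 1

for t in (1, 10, 50):
    print(t, row_spread(mat_power(soft, t)), row_spread(mat_power(sharp, t)))
```

In this toy case the spread for `soft` collapses within a few steps, while `sharp` keeps a clear dependence on the initial state for much longer, which is consistent with the abstract's claim about transition probabilities near 0 or 1.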
A Charge-Based CMOS Parallel Analog Vector Quantizer
Cauwenberghs, Gert, Pedroni, Volnei
We present an analog VLSI chip for parallel analog vector quantization. The MOSIS 2.0 µm double-poly CMOS Tiny chip contains an array of 16 x 16 charge-based distance estimation cells, implementing a mean absolute difference (MAD) metric operating on a 16-input analog vector field and 16 analog template vectors.
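The MAD metric mentioned above can be sketched in software as follows. This is only a numerical illustration of the distance computation, not a model of the charge-based circuit. Selecting the template with the smallest distance is the usual vector quantization step and is an assumption here, as are the function names and the example data.

```python
def mean_absolute_difference(x, template):
    """Mean absolute difference (MAD) between two equal-length vectors."""
    if len(x) != len(template):
        raise ValueError("input and template must have the same length")
    return sum(abs(a - b) for a, b in zip(x, template)) / len(x)


def quantize(x, templates):
    """Return (index of the closest template, list of all MAD distances)."""
    distances = [mean_absolute_difference(x, t) for t in templates]
    winner = min(range(len(templates)), key=distances.__getitem__)
    return winner, distances


# Hypothetical data: 16 templates of 16 components each, matching the
# 16 x 16 array size quoted in the abstract.
templates = [[k / 15.0] * 16 for k in range(16)]
x = [0.41] * 16

winner, distances = quantize(x, templates)
print(winner, distances[winner])
```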
A Connectionist Technique for Accelerated Textual Input: Letting a Network Do the Typing
Each year people spend a huge amount of time typing. The text people type typically contains a tremendous amount of redundancy due to predictable word usage patterns and the text's structure. This paper describes a neural network system called AutoTypist that monitors a person's typing and predicts what will be entered next. AutoTypist displays the most likely subsequent word to the typist, who can accept it with a single keystroke, instead of typing it in its entirety. The multi-layer perceptron at the heart of AutoTypist adapts its predictions of likely subsequent text to the user's word usage pattern, and to the characteristics of the text currently being typed. Increases in typing speed of 2-3% when typing English prose and 10-20% when typing C code have been demonstrated using the system, suggesting a potential time savings of more than 20 hours per user per year. In addition to increasing typing speed, AutoTypist reduces the number of keystrokes a user must type by a similar amount (2-3% for English, 10-20% for computer programs). This keystroke savings has the potential to significantly reduce the frequency and severity of repeated stress injuries caused by typing, which are the most common injury suffered in today's office environment.
A Mixture Model System for Medical and Machine Diagnosis
Stensmo, Magnus, Sejnowski, Terrence J.
Diagnosis of human disease or machine fault is a missing data problem since many variables are initially unknown. Additional information needs to be obtained. The joint probability distribution of the data can be used to solve this problem. We model this with mixture models whose parameters are estimated by the EM algorithm. This gives the benefit that missing data in the database itself can also be handled correctly. The request for new information to refine the diagnosis is performed using the maximum utility principle. Since the system is based on learning it is domain independent and less labor intensive than expert systems or probabilistic networks. An example using a heart disease database is presented.
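To make the missing-data idea concrete, here is a minimal sketch of inference in a mixture model when only some variables are observed. It assumes binary variables that are independent within each mixture component, and it uses hand-set parameters in place of EM estimates; these choices, the function names, and the numbers are all illustrative assumptions and not the authors' system. Under this assumption, unobserved variables are marginalized simply by leaving them out of the likelihood product. The maximum utility step for choosing the next question is not covered.

```python
def component_posterior(observed, priors, bernoulli_params):
    """Posterior over mixture components given the observed variables only.

    observed: dict mapping variable index -> 0 or 1 (missing ones are absent)
    priors: mixing proportions, one per component
    bernoulli_params: per component, a list of P(variable = 1 | component)
    """
    joint = []
    for prior, params in zip(priors, bernoulli_params):
        likelihood = prior
        for j, value in observed.items():
            likelihood *= params[j] if value == 1 else 1.0 - params[j]
        joint.append(likelihood)
    total = sum(joint)
    return [v / total for v in joint]


def predict_variable(j, observed, priors, bernoulli_params):
    """P(variable j = 1 | observed), for a variable j that is not observed."""
    posterior = component_posterior(observed, priors, bernoulli_params)
    return sum(r * params[j] for r, params in zip(posterior, bernoulli_params))


# Hypothetical two-component model over three binary variables;
# variable 2 plays the role of the diagnosis.
priors = [0.5, 0.5]
params = [[0.9, 0.8, 0.9],
          [0.1, 0.2, 0.1]]

print(predict_variable(2, {}, priors, params))      # nothing observed yet
print(predict_variable(2, {0: 1}, priors, params))  # one finding observed
```

Each new observation sharpens the component posterior, which in turn updates the predicted probability of every variable that is still unknown.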
Grouping Components of Three-Dimensional Moving Objects in Area MST of Visual Cortex
Zemel, Richard S., Sejnowski, Terrence J.
Previous investigators have suggested that these cells may represent self-motion. Spiral patterns can also be generated by the relative motion of the observer and a particular object. An MST cell may then account for some portion of the complex flow field, and the set of active cells could encode the entire flow; in this manner, MST effectively segments moving objects. Such a grouping operation is essential in interpreting scenes containing several independent moving objects and observer motion. We describe a model based on the hypothesis that the selective tuning of MST cells reflects the grouping of object components undergoing coherent motion. Inputs to the model were generated from sequences of ray-traced images that simulated realistic motion situations, combining observer motion, eye movements, and independent object motion. The input representation was modeled after response properties of neurons in area MT, which provides the primary input to area MST. After applying an unsupervised learning algorithm, the units became tuned to patterns signaling coherent motion. The results match many of the known properties of MST cells and are consistent with recent studies indicating that these cells process 3-D object motion information.
Combining Estimators Using Non-Constant Weighting Functions
Tresp, Volker, Taniguchi, Michiaki
This paper discusses the linearly weighted combination of estimators in which the weighting functions are dependent on the input. We show that the weighting functions can be derived either by evaluating the input dependent variance of each estimator or by estimating how likely it is that a given estimator has seen data in the region of the input space close to the input pattern. The latter solution is closely related to the mixture of experts approach and we show how learning rules for the mixture of experts can be derived from the theory about learning with missing features. The presented approaches are modular since the weighting functions can easily be modified (no retraining) if more estimators are added. Furthermore, it is easy to incorporate estimators which were not derived from data such as expert systems or algorithms.
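The variance-based weighting described above might be sketched as follows. The sketch assumes weights proportional to the inverse of each estimator's input-dependent variance, normalized to sum to one; the estimators, the variance functions, and the names are hypothetical and chosen only for illustration.

```python
def combine(x, estimators, variances):
    """Combine estimators with input-dependent, inverse-variance weights.

    estimators: list of functions f_i(x) returning a prediction
    variances:  list of functions v_i(x) returning a positive variance
    Returns (combined prediction, list of weights at x).
    """
    inverse = [1.0 / v(x) for v in variances]
    total = sum(inverse)
    weights = [w / total for w in inverse]
    prediction = sum(w * f(x) for w, f in zip(weights, estimators))
    return prediction, weights


# Hypothetical estimators: each is assumed reliable near a different input.
estimators = [lambda x: 2.0 * x, lambda x: x + 1.0]
variances = [lambda x: 0.1 + x ** 2,           # most reliable near x = 0
             lambda x: 0.1 + (x - 2.0) ** 2]   # most reliable near x = 2

for x in (0.0, 1.0, 2.0):
    print(x, combine(x, estimators, variances))
```

Because the weights are computed from the variance functions at query time, adding another estimator only means appending to the two lists; none of the existing estimators needs retraining, which matches the modularity noted in the abstract.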
Using a neural net to instantiate a deformable model
Williams, Christopher K. I., Revow, Michael, Hinton, Geoffrey E.
Deformable models are an attractive approach to recognizing nonrigid objects which have considerable within class variability. However, there are severe search problems associated with fitting the models to data. We show that by using neural networks to provide better starting points, the search time can be significantly reduced. The method is demonstrated on a character recognition task.
In previous work we have developed an approach to handwritten character recognition based on the use of deformable models (Hinton, Williams and Revow, 1992a; Revow, Williams and Hinton, 1993). We have obtained good performance with this method, but a major problem is that the search procedure for fitting each model to an image is very computationally intensive, because there is no efficient algorithm (like dynamic programming) for this task. In this paper we demonstrate that it is possible to "compile down" some of the knowledge gained while fitting models to data to obtain better starting points that significantly reduce the search time.
1 DEFORMABLE MODELS FOR DIGIT RECOGNITION
The basic idea in using deformable models for digit recognition is that each digit has a model, and a test image is classified by finding the model which is most likely to have generated it. The quality of the match between model and test image depends on the deformation of the model, the amount of ink that is attributed to noise and the distance of the remaining ink from the deformed model.
An Actor/Critic Algorithm that is Equivalent to Q-Learning
Crites, Robert H., Barto, Andrew G.
We prove the convergence of an actor/critic algorithm that is equivalent to Q-learning by construction. Its equivalence is achieved by encoding Q-values within the policy and value function of the actor and critic. The resultant actor/critic algorithm is novel in two ways: it updates the critic only when the most probable action is executed from any given state, and it rewards the actor using criteria that depend on the relative probability of the action that was executed.
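For reference, the standard tabular Q-learning update that the algorithm above is stated to be equivalent to can be sketched as below. This is the textbook rule, not the paper's actor/critic formulation; the state names, the step size `alpha`, and the discount `gamma` are illustrative assumptions.

```python
def q_learning_update(q, state, action, reward, next_state,
                      alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the TD error.

    q: dict mapping state -> dict mapping action -> Q-value
    """
    best_next = max(q[next_state].values())
    td_error = reward + gamma * best_next - q[state][action]
    q[state][action] += alpha * td_error
    return td_error


# Hypothetical two-state, two-action table.
q = {
    "s0": {"left": 0.0, "right": 0.0},
    "s1": {"left": 0.0, "right": 1.0},
}
td = q_learning_update(q, "s0", "right", reward=0.5, next_state="s1")
print(td, q["s0"]["right"])
```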
Connectionist Speaker Normalization with Generalized Resource Allocating Networks
Furlanello, Cesare, Giuliani, Diego, Trentin, Edmondo
The paper presents a rapid speaker-normalization technique based on neural network spectral mapping. The neural network is used as a front-end of a continuous speech recognition system (speaker-dependent, HMM-based) to normalize the input acoustic data from a new speaker. The spectral difference between speakers can be reduced using a limited amount of new acoustic data (40 phonetically rich sentences). Recognition error of phone units from the acoustic-phonetic continuous speech corpus APASCI is decreased with an adaptability ratio of 25%. We used local basis networks of elliptical Gaussian kernels, with recursive allocation of units and online optimization of parameters (GRAN model). For this application, the model included a linear term. The results compare favorably with multivariate linear mapping based on constrained orthonormal transformations.
A Neural Model of Delusions and Hallucinations in Schizophrenia
Ruppin, Eytan, Reggia, James A., Horn, David
We implement and study a computational model of Stevens' [1992] theory of the pathogenesis of schizophrenia. This theory hypothesizes that the onset of schizophrenia is associated with reactive synaptic regeneration occurring in brain regions receiving degenerating temporal lobe projections. Concentrating on one such area, the frontal cortex, we model a frontal module as an associative memory neural network whose input synapses represent incoming temporal projections. We analyze how, in the face of weakened external input projections, compensatory strengthening of internal synaptic connections and increased noise levels can maintain memory capacities (which are generally preserved in schizophrenia). However, these compensatory changes adversely lead to spontaneous, biased retrieval of stored memories, which corresponds to the occurrence of schizophrenic delusions and hallucinations without any apparent external trigger, and accounts for their tendency to concentrate on just a few central themes. Our results explain why these symptoms tend to wane as schizophrenia progresses, and why delayed therapeutical intervention leads to a much slower response.