Single Transistor Learning Synapses
Hasler, Paul E., Diorio, Chris, Minch, Bradley A., Mead, Carver
The past few years have produced a number of efforts to design VLSI chips which "learn from experience." The first step toward this goal is developing a silicon analog for a synapse. We have successfully developed such a synapse using only a single transistor.
A Mixture Model System for Medical and Machine Diagnosis
Stensmo, Magnus, Sejnowski, Terrence J.
Diagnosis of human disease or machine fault is a missing data problem, since many variables are initially unknown and additional information needs to be obtained. The joint probability distribution of the data can be used to solve this problem. We model this with mixture models whose parameters are estimated by the EM algorithm. This gives the benefit that missing data in the database itself can also be handled correctly. The request for new information to refine the diagnosis is performed using the maximum utility principle. Since the system is based on learning, it is domain independent and less labor intensive than expert systems or probabilistic networks. An example using a heart disease database is presented.
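The abstract does not spell out the mixture family, so the following is a minimal sketch of the core mechanism under one common assumption: a diagonal-covariance Gaussian mixture fit by EM, with missing entries coded as NaN and simply left out of each record's likelihood. All function and variable names are illustrative, not the paper's.

```python
import numpy as np

def em_mixture_missing(X, K, n_iter=50, seed=0):
    """EM for a diagonal-covariance Gaussian mixture where X may
    contain NaNs marking missing entries.  Missing dimensions are
    left out of each record's likelihood, so incomplete cases
    still contribute to the fit."""
    rng = np.random.default_rng(seed)
    N, D = X.shape
    obs = ~np.isnan(X)                       # observation mask
    mu = rng.normal(size=(K, D))
    var = np.ones((K, D))
    pi = np.full(K, 1.0 / K)
    Xf = np.where(obs, X, 0.0)               # NaNs zeroed; masked below
    for _ in range(n_iter):
        # E-step: responsibilities from observed dimensions only.
        logr = np.zeros((N, K))
        for k in range(K):
            ll = (Xf - mu[k]) ** 2 / var[k] + np.log(2 * np.pi * var[k])
            logr[:, k] = np.log(pi[k]) - 0.5 * (ll * obs).sum(axis=1)
        logr -= logr.max(axis=1, keepdims=True)
        r = np.exp(logr)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: weighted updates over observed entries only.
        for k in range(K):
            w = r[:, k][:, None] * obs
            denom = w.sum(axis=0) + 1e-9
            mu[k] = (w * Xf).sum(axis=0) / denom
            var[k] = (w * (Xf - mu[k]) ** 2).sum(axis=0) / denom + 1e-6
        pi = r.mean(axis=0)
    return pi, mu, var
```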
Reinforcement Learning with Soft State Aggregation
Singh, Satinder P., Jaakkola, Tommi, Jordan, Michael I.
It is widely accepted that the use of more compact representations than lookup tables is crucial to scaling reinforcement learning (RL) algorithms to real-world problems. Unfortunately, almost all of the theory of reinforcement learning assumes lookup table representations. In this paper we address the pressing issue of combining function approximation and RL, and present 1) a function approximator based on a simple extension to state aggregation (a commonly used form of compact representation), namely soft state aggregation, 2) a theory of convergence for RL with arbitrary, but fixed, soft state aggregation, 3) a novel intuitive understanding of the effect of state aggregation on online RL, and 4) a new heuristic adaptive state aggregation algorithm that finds improved compact representations by exploiting the non-discrete nature of soft state aggregation. Preliminary empirical results are also presented.
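As a rough illustration of item 1), here is a sketch of how a single Q-learning update might look under soft state aggregation: each state belongs to every cluster with some probability, a state's value is the membership-weighted average of cluster values, and the TD error is credited back to clusters in proportion to membership. The interfaces are hypothetical and not the paper's notation.

```python
import numpy as np

def soft_q_update(Q, P, s, a, r, s_next, alpha=0.1, gamma=0.95):
    """One Q-learning step under soft state aggregation.
    Q : (n_clusters, n_actions) cluster-level value table.
    P : (n_states, n_clusters) soft memberships; rows sum to 1.
    A state's value is the membership-weighted average of cluster
    values; the TD error is apportioned to clusters in proportion
    to the current state's membership."""
    q_s = P[s] @ Q                     # aggregated Q(s, .)
    q_next = P[s_next] @ Q             # aggregated Q(s', .)
    td = r + gamma * q_next.max() - q_s[a]
    Q[:, a] += alpha * P[s] * td       # soft credit assignment
    return Q
```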
A Charge-Based CMOS Parallel Analog Vector Quantizer
Cauwenberghs, Gert, Pedroni, Volnei
We present an analog VLSI chip for parallel analog vector quantization. The MOSIS 2.0 µm double-poly CMOS Tiny chip contains an array of 16 × 16 charge-based distance estimation cells, implementing a mean absolute difference (MAD) metric operating on a 16-input analog vector field and 16 analog template vectors.
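In software, the chip's selection rule reduces to a few lines; a sketch of the MAD metric and winner selection follows (the chip evaluates all 16 templates in parallel in analog hardware, while this reference version does the same computation sequentially in NumPy):

```python
import numpy as np

def mad_vq(x, templates):
    """Return the index of the template minimizing the mean
    absolute difference (MAD) to input x, plus all distances."""
    dists = np.mean(np.abs(templates - x), axis=1)
    return int(np.argmin(dists)), dists

# Example matching the chip's dimensions: 16 templates, 16 inputs.
rng = np.random.default_rng(0)
templates = rng.normal(size=(16, 16))
winner, dists = mad_vq(rng.normal(size=16), templates)
```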
On-line Learning of Dichotomies
Barkai, N., Seung, H. S., Sompolinsky, H.
The performance of online algorithms for learning dichotomies is studied. In online learning, the number of examples P is equivalent to the learning time, since each example is presented only once. The learning curve, or generalization error as a function of P, depends on the schedule at which the learning rate is lowered.
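As a concrete instance of the setting, here is a sketch of a one-pass, mistake-driven learner with an annealed learning rate; the 1/t^power schedule is one illustrative choice, not necessarily one of the schedules analyzed in the paper.

```python
import numpy as np

def online_dichotomy(X, y, eta0=1.0, power=1.0):
    """One-pass online learning of a dichotomy: each example
    (x, label) with label in {-1, +1} is seen exactly once, so the
    example count doubles as learning time.  The choice of
    annealing schedule is what shapes the learning curve."""
    w = np.zeros(X.shape[1])
    for t, (x, label) in enumerate(zip(X, y), start=1):
        eta = eta0 / t ** power        # lowered learning rate
        if label * (w @ x) <= 0:       # mistake-driven update
            w += eta * label * x
    return w
```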
Using Voice Transformations to Create Additional Training Talkers for Word Spotting
Chang, Eric I., Lippmann, Richard P.
Lack of training data has always been a constraint in training speech recognizers. This research presents a voice transformation technique which increases the variety among training talkers. The resulting more varied training set provided up to 2.9 percentage points of improvement in the figure of merit (average detection rate) of a high performance word spotter. This improvement is similar to the increase in performance provided by doubling the amount of training data (Carlson, 1994). This technique can also be applied to other speech recognition systems such as continuous speech recognition, talker identification, and isolated speech recognition.
Estimating Conditional Probability Densities for Periodic Variables
Bishop, Chris M., Legleye, Claire
Many applications of neural networks can be formulated in terms of a multivariate nonlinear mapping from an input vector x to a target vector t. A conventional neural network approach, based on least squares for example, leads to a network mapping which approximates the regression of t on x. A more complete description of the data can be obtained by estimating the conditional probability density of t, conditioned on x, which we write as p(t|x). Various techniques exist for modelling such densities when the target variables live in a Euclidean space. However, a number of potential applications involve angle-like output variables which are periodic on some finite interval (usually chosen to be (0, 2π)).
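One natural kernel for densities on a periodic interval is the von Mises (circular normal) distribution; the sketch below evaluates a fixed von Mises mixture, standing in for the conditional case where a network would emit the mixture parameters as functions of x. The parameter values are made up for illustration, and this is not necessarily the paper's exact kernel choice.

```python
import numpy as np
from scipy.special import i0          # modified Bessel function I_0

def von_mises_mixture_pdf(t, weights, means, kappas):
    """Density of a mixture of von Mises (circular normal)
    components, a natural kernel for a target variable that is
    periodic on (0, 2*pi)."""
    t = np.asarray(t, dtype=float)[..., None]
    comp = np.exp(kappas * np.cos(t - means)) / (2 * np.pi * i0(kappas))
    return comp @ weights

# Evaluate a made-up 2-component mixture at two angles.
p = von_mises_mixture_pdf([0.0, np.pi],
                          weights=np.array([0.7, 0.3]),
                          means=np.array([0.5, 4.0]),
                          kappas=np.array([2.0, 5.0]))
```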
Interference in Learning Internal Models of Inverse Dynamics in Humans
Shadmehr, Reza, Brashers-Krug, Tom, Mussa-Ivaldi, Ferdinando A.
Experiments were performed to reveal some of the computational properties of the human motor memory system. We show that as humans practice reaching movements while interacting with a novel mechanical environment, they learn an internal model of the inverse dynamics of that environment. The representation of the internal model in memory is such that there is interference when there is an attempt to learn a new inverse dynamics map immediately after an anticorrelated mapping was learned. We suggest that this interference is an indication that the same computational elements used to encode the first inverse dynamics map are being used to learn the second mapping. We predict that this leads to a forgetting of the initially learned skill.

1 Introduction

In tasks where we use our hands to interact with a tool, our motor system develops a model of the dynamics of that tool and uses this model to control the coupled dynamics of our arm and the tool (Shadmehr and Mussa-Ivaldi 1994). In physical systems theory, the tool is a mechanical analogue of an admittance, mapping a force as input onto a change in state as output (Hogan 1985).
Learning with Preknowledge: Clustering with Point and Graph Matching Distance Measures
Gold, Steven, Rangarajan, Anand, Mjolsness, Eric
Recently, the importance of preknowledge for learning has been convincingly argued from a statistical framework [Geman et al., 1992]. Researchers have proposed that our brains may incorporate preknowledge in the form of distance measures [Shepard, 1989]. The neural network community has begun to explore this idea via tangent distance [Simard et al., 1993], model learning [Williams et al., 1993] and point matching distances [Gold et al., 1994]. However, only the point matching distances have been invariant under permutations. Here we extend that work by enhancing both the scope and function of those distance measures, significantly expanding the problem domains where learning may take place. We learn objects consisting of noisy 2-D point-sets or noisy weighted graphs by clustering with point matching and graph matching distance measures. The point matching measure is approx.
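As a stand-in for the paper's point matching measure (which is computed by an optimization network), the sketch below uses exact bipartite matching to obtain a distance between equal-sized 2-D point sets that shares the key property of permutation invariance; it is not the paper's algorithm.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def point_match_dist(A, B):
    """Permutation-invariant distance between two equal-sized 2-D
    point sets: mean Euclidean distance under the best one-to-one
    matching, found here by exact bipartite assignment."""
    cost = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2)
    rows, cols = linear_sum_assignment(cost)   # optimal permutation
    return cost[rows, cols].mean()
```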
Temporal Dynamics of Generalization in Neural Networks
Wang, Changfeng, Venkatesh, Santosh S.
This paper presents a rigorous characterization of how a general nonlinear learning machine generalizes during the training process when it is trained on a random sample using a gradient descent algorithm based on reduction of training error. It is shown, in particular, that best generalization performance occurs, in general, before the global minimum of the training error is achieved. The different roles played by the complexity of the machine class and the complexity of the specific machine in the class during learning are also precisely demarcated.

1 INTRODUCTION

In learning machines such as neural networks, two major factors that affect the 'goodness of fit' of the examples are network size (complexity) and training time. These are also the major factors that affect the generalization performance of the network. Many theoretical studies exploring the relation between generalization performance and machine complexity support the parsimony heuristics suggested by Occam's razor, to wit that amongst machines with similar training performance one should opt for the machine of least complexity.
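The practical upshot, stopping training where held-out error bottoms out rather than at the training-error minimum, can be sketched in a few lines; `step` and `val_err` are hypothetical caller-supplied callables for one descent update and a held-out error estimate, respectively.

```python
import numpy as np

def train_with_early_stopping(step, val_err, theta, n_steps=1000):
    """Run gradient descent on training error while tracking
    held-out error, and keep the best-generalizing iterate, which
    per the paper's analysis typically occurs before the training
    error reaches its global minimum."""
    best_v, best_theta, best_t = np.inf, theta.copy(), 0
    for t in range(1, n_steps + 1):
        theta = step(theta)            # reduce training error
        v = val_err(theta)
        if v < best_v:                 # new generalization optimum
            best_v, best_theta, best_t = v, theta.copy(), t
    return best_theta, best_t, best_v
```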