Information Technology
A Large-Scale Neural Network Which Recognizes Handwritten Kanji Characters
We propose a new way to construct a large-scale neural network for the recognition of 3,000 handwritten Kanji characters. This neural network consists of three parts: a collection of small-scale networks which are trained individually on a small number of Kanji characters; a network which integrates the output from the small-scale networks; and a process to facilitate the integration of these networks. The recognition rate of the total system is comparable with that of the individual small-scale networks. Our results indicate that the proposed method is effective for constructing a large-scale network without loss of recognition performance.
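As a schematic illustration of this three-part organization, the sketch below (ours, in NumPy; the random weight matrices stand in for individually trained parameters, and all names are hypothetical) combines the outputs of several small classifiers through a single integration layer:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Several small "expert" networks, each responsible for its own subset of
# character classes (random weights here stand in for trained parameters).
n_features, classes_per_subnet, n_subnets = 64, 10, 3
subnets = [rng.normal(size=(n_features, classes_per_subnet))
           for _ in range(n_subnets)]

# The integration network maps the concatenated sub-network outputs to a
# score over all classes handled by the total system.
W_integrate = rng.normal(size=(n_subnets * classes_per_subnet,
                               n_subnets * classes_per_subnet))

def recognize(x):
    # Each small network scores its own character subset...
    sub_out = np.concatenate([softmax(x @ W) for W in subnets])
    # ...and the integration network arbitrates among all subsets at once.
    return softmax(sub_out @ W_integrate)

x = rng.normal(size=n_features)   # feature vector for one character
print(recognize(x).argmax())      # index of the predicted class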
Dynamic Behavior of Constrained Back-Propagation Networks
It is generally admitted that the generalization performance of back-propagation networks (Rumelhart, Hinton & Williams, 1986) will depend on the relative sizes of the training data and of the trained network. By analogy with curve fitting and on theoretical grounds, the generalization performance of the network should decrease as the size of the network and the associated number of degrees of freedom increase (Rumelhart, 1987; Denker et al., 1987; Hanson & Pratt, 1989). This paper examines the dynamics of the standard back-propagation algorithm (BP) and of a constrained back-propagation variation (CBP), designed to adapt the size of the network to the training data base. The performance, learning dynamics, and resulting representations of the two algorithms are compared.
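The constraint in CBP can take several forms; the sketch below (our illustration, not the paper's exact algorithm) uses plain weight decay, one common penalty that shrinks the network's effective degrees of freedom:

```python
import numpy as np

def cbp_gradient(w, grad_loss, lam=1e-3):
    """Step direction for a constrained back-propagation variant.

    grad_loss is the gradient of the data-fitting error w.r.t. the
    weights. The penalty term (plain weight decay here, one common
    choice of constraint) pushes unneeded weights toward zero,
    effectively reducing the number of active degrees of freedom.
    """
    return grad_loss + lam * w

# One gradient-descent step on a single linear unit, for illustration.
rng = np.random.default_rng(0)
w = rng.normal(size=4)
x, y = rng.normal(size=4), 1.0
grad = (w @ x - y) * x            # squared-error gradient for a linear unit
w -= 0.1 * cbp_gradient(w, grad)
```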
Coupled Markov Random Fields and Mean Field Theory
Geiger, Davi, Girosi, Federico
In recent years many researchers have investigated the use of Markov Random Fields (MRFs) for computer vision. They can be applied, for example, to reconstruct surfaces from sparse and noisy depth data coming from the output of a visual process, or to integrate early vision processes to label physical discontinuities. In this paper we show that applying mean field theory to these MRF models yields a class of neural networks. These networks can speed up the solution of the MRF models. The method is not restricted to computer vision.
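A minimal sketch of the idea, assuming an Ising-style binary MRF with the noisy data acting as an external field (our simplification of the general formulation): replacing each stochastic variable by its mean turns stochastic relaxation into a deterministic, parallel network update.

```python
import numpy as np

def mean_field_denoise(noisy, beta=1.0, coupling=1.0, iters=50):
    """Mean-field iteration for a binary (+1/-1) MRF on an image grid.

    Each unit repeatedly applies
        m_i <- tanh(beta * (coupling * sum_of_neighbor_means + data_i)),
    i.e. the network of means relaxes deterministically instead of
    sampling the underlying MRF.
    """
    m = noisy.astype(float)                    # initialize means at the data
    for _ in range(iters):
        nbr = (np.roll(m, 1, 0) + np.roll(m, -1, 0) +
               np.roll(m, 1, 1) + np.roll(m, -1, 1))
        m = np.tanh(beta * (coupling * nbr + noisy))
    return np.sign(m)

rng = np.random.default_rng(0)
clean = np.ones((16, 16)); clean[:, 8:] = -1              # two-region image
noisy = clean * np.sign(rng.uniform(size=clean.shape) - 0.1)  # flip ~10%
print((mean_field_denoise(noisy) == clean).mean())        # fraction restored
```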
Sequential Decision Problems and Neural Networks
Barto, A. G., Sutton, R. S., Watkins, C. J. C. H.
Decision making tasks that involve delayed consequences are very common yet difficult to address with supervised learning methods. If there is an accurate model of the underlying dynamical system, then these tasks can be formulated as sequential decision problems and solved by Dynamic Programming. This paper discusses reinforcement learning in terms of the sequential decision framework and shows how a learning algorithm similar to the one implemented by the Adaptive Critic Element used in the pole-balancer of Barto, Sutton, and Anderson (1983), and further developed by Sutton (1984), fits into this framework. Adaptive neural networks can play significant roles as modules for approximating the functions required for solving sequential decision problems.
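A minimal sketch of the temporal-difference update at the heart of such adaptive-critic methods (our tabular simplification; the Adaptive Critic Element itself learns a parameterized prediction rather than a lookup table):

```python
import numpy as np

def td0_update(V, s, r, s_next, alpha=0.1, gamma=0.9):
    """One temporal-difference update of a state-value table.

    The critic's prediction error, r + gamma * V[s_next] - V[s],
    plays the role of the internal reinforcement signal in
    adaptive-critic architectures.
    """
    V[s] += alpha * (r + gamma * V[s_next] - V[s])
    return V

V = np.zeros(5)                  # values for a 5-state toy problem
V = td0_update(V, s=2, r=1.0, s_next=3)
print(V)
```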
Predicting Weather Using a Genetic Memory: A Combination of Kanerva's Sparse Distributed Memory with Holland's Genetic Algorithms
Kanerva's sparse distributed memory (SDM) is an associative-memory model based on the mathematical properties of high-dimensional binary address spaces. Holland's genetic algorithms are a search technique for high-dimensional spaces inspired by evolutionary processes of DNA. "Genetic Memory" is a hybrid of the above two systems, in which the memory uses a genetic algorithm to dynamically reconfigure its physical storage locations to reflect correlations between the stored addresses and data. For example, when presented with raw weather station data, the Genetic Memory discovers specific features in the weather data which correlate well with upcoming rain, and reconfigures the memory to utilize this information effectively. This architecture is designed to maximize the ability of the system to scale up to handle real-world problems.
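A minimal SDM sketch (ours, in NumPy) makes the "physical storage locations" concrete: in a Genetic Memory, the hard address array below is what the genetic algorithm would recombine to reflect correlations in the stored data.

```python
import numpy as np

rng = np.random.default_rng(0)
n_dims, n_locations, radius = 256, 1000, 111

# Hard storage locations. In a Genetic Memory these addresses are the
# genetic material: the GA recombines them so that useful locations
# (those correlating with the data) come to dominate the memory.
addresses = rng.integers(0, 2, size=(n_locations, n_dims))
counters = np.zeros((n_locations, n_dims))

def active(addr):
    # A location fires if its address lies within Hamming radius of addr.
    return (addresses != addr).sum(axis=1) <= radius

def write(addr, data):
    counters[active(addr)] += 2 * data - 1      # data in {0,1} -> {-1,+1}

def read(addr):
    return (counters[active(addr)].sum(axis=0) > 0).astype(int)

pattern = rng.integers(0, 2, size=n_dims)
write(pattern, pattern)                          # autoassociative store
print((read(pattern) == pattern).mean())         # recall accuracy
```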
A Reconfigurable Analog VLSI Neural Network Chip
Satyanarayana, Srinagesh, Tsividis, Yannis P., Graf, Hans Peter
The distributed-neuron synapses are arranged in blocks of 16, which we call '4 x 4 tiles'. Switch matrices are interleaved between these tiles to provide programmability of the interconnections. With a small area overhead (15%), the 1024 units of the network can be rearranged in various configurations. Possible configurations include a 12-32-12 network, a 16-12-12-16 network, two 12-32 networks, etc. (the numbers separated by dashes indicate the number of units per layer, including the input layer). Weights are stored in analog form on MOS capacitors.
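As a back-of-the-envelope illustration (our own arithmetic, not the chip's actual routing constraint), one can check that the quoted layer configurations fit within a fixed budget of programmable synapses:

```python
# Hypothetical helper for reasoning about configurations of such a chip:
# the layer sizes below come from the abstract, while the budget check
# itself is our illustration of the reconfigurability idea.
SYNAPSE_BUDGET = 1024

def synapses_needed(layers):
    # Fully connected consecutive layers, input layer included.
    return sum(a * b for a, b in zip(layers, layers[1:]))

for config in ([12, 32, 12], [16, 12, 12, 16], [12, 32]):
    used = synapses_needed(config)
    print(config, used, "fits" if used <= SYNAPSE_BUDGET else "too big")
```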
The Effect of Catecholamines on Performance: From Unit to System Behavior
Servan-Schreiber, David, Printz, Harry, Cohen, Jonathan D.
We present a model of catecholamine effects in a network of neural-like elements. We argue that changes in the responsivity of individual elements do not affect their ability to detect a signal and ignore noise. However, the same changes in cell responsivity in a network of such elements do improve the signal detection performance of the network as a whole. We show how this result can be used in a computer simulation of behavior to account for the effect of CNS stimulants on the signal detection performance of human subjects.
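Responsivity changes of this kind are naturally formalized as a gain term in a logistic activation function; the sketch below (our illustration of that formalization) shows how an increased gain steepens a unit's response without shifting its midpoint:

```python
import numpy as np

def logistic(x, gain=1.0, bias=0.0):
    """Logistic activation with a multiplicative gain term.

    Increased catecholaminergic activity is modeled as a higher gain,
    which steepens the unit's input-output curve while leaving its
    midpoint (the bias) unchanged.
    """
    return 1.0 / (1.0 + np.exp(-gain * (x - bias)))

x = np.linspace(-4, 4, 9)
print(logistic(x, gain=1.0))   # baseline responsivity
print(logistic(x, gain=3.0))   # "stimulant" condition: steeper response
```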
The Perceptron Algorithm Is Fast for Non-Malicious Distributions
Interest in this algorithm waned in the 1970s after it was emphasized [Minsky and Papert, 1969] (1) that the class of problems solvable by a single half space was limited, and (2) that the Perceptron algorithm, although converging in finite time, did not converge in polynomial time. In the 1980s, however, it has become evident that there is no hope of providing a learning algorithm which can learn arbitrary functions in polynomial time, and much research has thus been restricted to algorithms which learn a function drawn from a particular class of functions. Moreover, learning theory has focused on protocols like that of [Valiant, 1984], where we seek to classify not a fixed set of examples but examples drawn from a probability distribution. This allows a natural notion of "generalization". There are very few classes which have yet been proven learnable in polynomial time, and one of these is the class of half spaces. Thus there is considerable theoretical interest now in studying the problem of learning a single half space, and so it is natural to reexamine the Perceptron algorithm within the formalism of Valiant.
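For reference, the Perceptron algorithm under discussion is the classic mistake-driven update rule; a minimal sketch (ours):

```python
import numpy as np

def perceptron(X, y, epochs=100):
    """Classic Perceptron: rows of X are examples, y is in {-1, +1}.

    On each mistake the weight vector is moved toward (or away from)
    the misclassified example; for linearly separable data this
    converges after finitely many updates.
    """
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * (w @ xi) <= 0:      # mistake: update
                w += yi * xi
    return w

X = np.array([[1.0, 2.0], [2.0, 1.0], [-1.0, -2.0], [-2.0, -1.0]])
y = np.array([1, 1, -1, -1])
print(perceptron(X, y))
```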
Using Local Models to Control Movement
This paper explores the use of a model neural network for motor learning. Steinbuch and Taylor presented neural network designs to do nearest neighbor lookup in the early 1960s. In this paper their nearest neighbor network is augmented with a local model network, which fits a local model to a set of nearest neighbors. The network design is equivalent to local regression. This network architecture can represent smooth nonlinear functions, yet has simple training rules with a single global optimum.
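A minimal sketch of the nearest-neighbor-plus-local-model idea (ours, assuming a k-nearest-neighbor lookup and an unweighted local least-squares fit; the paper's network realization differs in detail):

```python
import numpy as np

def local_model_predict(X, y, query, k=5):
    """Nearest-neighbor lookup followed by a local linear model.

    The k nearest stored experiences are found, a linear model (least
    squares with a bias term) is fit to just those neighbors, and the
    query is evaluated under that local fit -- i.e. local regression.
    """
    idx = np.argsort(np.linalg.norm(X - query, axis=1))[:k]
    A = np.hstack([X[idx], np.ones((k, 1))])    # design matrix with bias
    coef, *_ = np.linalg.lstsq(A, y[idx], rcond=None)
    return np.append(query, 1.0) @ coef

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 1))
y = np.sin(3 * X[:, 0])                          # a smooth nonlinear target
print(local_model_predict(X, y, query=np.array([0.2])))
```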