Information Technology
The Tempo 2 Algorithm: Adjusting Time-Delays By Supervised Learning
Bodenhausen, Ulrich, Waibel, Alex
In this work we describe a new method that automatically adjusts the time-delays and the widths of time-windows in artificial neural networks. The inputs of the units are weighted by a Gaussian input-window over time, which allows the learning rules for the delays and widths to be derived in the same way as for the weights. Our results on a phoneme classification task compare well with results obtained with the TDNN by Waibel et al., which was manually optimized for the same task.
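A minimal sketch (our illustration, not the authors' code) of the Gaussian input-window idea: each connection weights the sending unit's recent activation history with a Gaussian centered at a learnable delay and with a learnable width, so gradients for the delay and width can be derived just as for an ordinary weight. All names and sizes below are assumptions for the example.

    import numpy as np

    def gaussian_window(t, delay, width):
        # Gaussian weighting over past time steps t = 0..T-1
        return np.exp(-0.5 * ((t - delay) / width) ** 2)

    def windowed_input(history, weight, delay, width):
        # history: activations of the sending unit over the last T steps
        t = np.arange(len(history))
        return weight * np.sum(gaussian_window(t, delay, width) * history)

    # Example: a connection attending mostly to about 3 steps in the past
    history = np.random.rand(10)
    print(windowed_input(history, weight=0.5, delay=3.0, width=1.0))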
Can neural networks do better than the Vapnik-Chervonenkis bounds?
These experiments are designed to test whether average generalization performance can surpass the worst-case bounds obtained from formal learning theory using the Vapnik-Chervonenkis dimension (Blumer et al., 1989). We indeed find that, in some cases, the average generalization is significantly better than the VC bound: the approach to perfect performance is exponential in the number of examples m, rather than the 1/m result of the bound. In other cases, we do find the 1/m behavior of the VC bound, and in these cases, the numerical prefactor is closely related to the prefactor contained in the bound.
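For concreteness, the two scaling regimes contrasted above can be written as follows; the constants $c$, $A$, and $m_0$ are illustrative placeholders, and $d$ denotes the VC dimension:

    \[ \epsilon_{\mathrm{VC}}(m) \;\le\; c\,\frac{d \log m}{m} \qquad \text{(worst-case bound: the } 1/m \text{ behavior)} \]
    \[ \epsilon_{\mathrm{avg}}(m) \;\sim\; A\, e^{-m/m_0} \qquad \text{(exponential approach observed in some cases)} \]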
Relaxation Networks for Large Supervised Learning Problems
Alspector, Joshua, Allen, Robert B., Jayakumar, Anthony, Zeppenfeld, Torsten, Meir, Ronny
Feedback connections are required so that the teacher signal on the output neurons can modify weights during supervised learning. Relaxation methods are needed for learning static patterns with full-time feedback connections. Feedback network learning techniques have not achieved wide popularity because of the still greater computational efficiency of back-propagation. We show by simulation that relaxation networks of the kind we are implementing in VLSI are capable of learning large problems just as back-propagation networks are. A microchip incorporates deterministic mean-field-theory learning as well as stochastic Boltzmann learning. A multiple-chip electronic system implementing these networks will make high-speed parallel learning in them feasible in the future.
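A minimal sketch (ours, not the chip's implementation) of deterministic mean-field relaxation with a contrastive Hebbian weight update, one common way such relaxation networks are trained: the network settles twice, once free and once with the outputs clamped to the teacher signal, and the weights move toward the clamped-phase correlations. All names and sizes are illustrative.

    import numpy as np

    def settle(W, s, clamp=None, steps=50, temperature=1.0):
        # Mean-field relaxation: s_i <- tanh(sum_j W_ij s_j / T)
        for _ in range(steps):
            s = np.tanh(W @ s / temperature)
            if clamp is not None:
                idx, vals = clamp
                s[idx] = vals          # hold output units at the teacher signal
        return s

    def contrastive_update(W, s_free, s_clamped, lr=0.01):
        # Hebbian difference of co-activations between clamped and free phases
        W = W + lr * (np.outer(s_clamped, s_clamped) - np.outer(s_free, s_free))
        np.fill_diagonal(W, 0.0)       # no self-connections
        return W

    n = 8
    rng = np.random.default_rng(0)
    W = rng.normal(0, 0.1, (n, n))
    W = (W + W.T) / 2                  # symmetric feedback weights
    s0 = rng.normal(0, 0.1, n)
    s_free = settle(W, s0.copy())
    s_clamped = settle(W, s0.copy(), clamp=([6, 7], [1.0, -1.0]))
    W = contrastive_update(W, s_free, s_clamped)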
Stereopsis by a Neural Network Which Learns the Constraints
Khotanzad, Alireza, Lee, Ying-Wung
This paper presents a neural network (NN) approach to the problem of stereopsis. The correspondence problem (finding the correct matches between the pixels of the epipolar lines of the stereo pair from amongst all the possible matches) is posed as a non-iterative many-to-one mapping. A two-layer feedforward NN architecture is developed to learn and code this nonlinear and complex mapping using the back-propagation learning rule and a training set. The important aspect of this technique is that none of the typical constraints, such as uniqueness and continuity, are explicitly imposed. All the applicable constraints are learned and internally coded by the NN, enabling it to be more flexible and more accurate than the existing methods. The approach is successfully tested on several random-dot stereograms. It is shown that the net can generalize its learned mapping to cases outside its training set. Advantages over the Marr-Poggio algorithm are discussed, and it is shown that the NN performance is superior.
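A minimal sketch (layer sizes and names are our assumptions) of posing correspondence as a learned many-to-one mapping: a two-layer feedforward net takes the two epipolar lines of a stereo pair and outputs one match value per pixel, trained by back-propagation on example stereograms.

    import numpy as np

    rng = np.random.default_rng(0)
    W1 = rng.normal(0, 0.1, (32, 2 * 16))   # hidden layer: input is both 16-pixel lines
    W2 = rng.normal(0, 0.1, (16, 32))       # output layer: one match value per pixel

    def forward(left_line, right_line):
        x = np.concatenate([left_line, right_line])
        h = np.tanh(W1 @ x)
        y = 1.0 / (1.0 + np.exp(-(W2 @ h)))  # sigmoid output in [0, 1]
        return y, h, x

    def backprop_step(left_line, right_line, target, lr=0.1):
        global W1, W2
        y, h, x = forward(left_line, right_line)
        dy = (y - target) * y * (1 - y)      # squared-error gradient through sigmoid
        dh = (W2.T @ dy) * (1 - h ** 2)
        W2 -= lr * np.outer(dy, h)
        W1 -= lr * np.outer(dh, x)

    # Hypothetical usage: train on (left line, right line, match code) examples
    left, right, target = rng.random(16), rng.random(16), rng.random(16)
    backprop_step(left, right, target)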
Asymptotic slowing down of the nearest-neighbor classifier
Snapp, Robert R., Psaltis, Demetri, Venkatesh, Santosh S.
If patterns are drawn from an n-dimensional feature space according to a probability distribution that obeys a weak smoothness criterion, we show that the probability that a random input pattern is misclassified by a nearest-neighbor classifier using M random reference patterns asymptotically satisfies

    \[ P_M(\mathrm{error}) \;\approx\; P_\infty(\mathrm{error}) + \frac{a}{M^{2/n}} \]

for sufficiently large values of M. Here, $P_\infty(\mathrm{error})$ denotes the probability of error in the infinite-sample limit, and is at most twice the error of a Bayes classifier. Although the value of the coefficient a depends upon the underlying probability distributions, the exponent of M is largely distribution-free. We thus obtain a concise relation between a classifier's ability to generalize from a finite reference sample and the dimensionality of the feature space, as well as an analytic validation of Bellman's well-known "curse of dimensionality."
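A small illustrative experiment (ours, not from the paper): estimate the finite-sample error of a 1-nearest-neighbor classifier as the number of reference patterns M grows, on a simple two-class Gaussian problem in n dimensions. The problem setup and all parameters are assumptions for the example.

    import numpy as np

    rng = np.random.default_rng(1)

    def sample(m, n):
        # Two overlapping Gaussian classes, separated along every axis
        labels = rng.integers(0, 2, m)
        x = rng.normal(0, 1, (m, n)) + labels[:, None]
        return x, labels

    def nn_error(M, n, n_test=2000):
        ref_x, ref_y = sample(M, n)
        test_x, test_y = sample(n_test, n)
        # Squared distances via ||a - b||^2 = ||a||^2 - 2 a.b + ||b||^2
        d = ((test_x ** 2).sum(1)[:, None]
             - 2 * test_x @ ref_x.T
             + (ref_x ** 2).sum(1)[None, :])
        pred = ref_y[d.argmin(axis=1)]   # label of the nearest reference pattern
        return (pred != test_y).mean()

    for M in (10, 100, 1000):
        print(M, nn_error(M, n=8))       # the approach to the limit is slow for large n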
Oriented Non-Radial Basis Functions for Image Coding and Analysis
Saha, Avijit, Christian, Jim, Tang, Dun-Sung, Wu, Chuan-Lin
We introduce oriented non-radial basis function networks (ONRBF) as a generalization of radial basis function networks (RBF), wherein the Euclidean distance metric in the exponent of the Gaussian is replaced by a more general polynomial. This permits the definition of more general regions, in particular hyper-ellipses with orientations. In the case of hyper-surface estimation, this scheme requires a smaller number of hidden units and alleviates the "curse of dimensionality" associated with kernel-type approximators. In the case of an image, the hidden units correspond to features in the image, and the parameters associated with each unit correspond to the rotation, scaling, and translation properties of that particular "feature". In the context of the ONRBF scheme, this means that an image can be represented by a small number of features. Since transformations of an image by rotation, scaling, and translation correspond to identical transformations of the individual features, the ONRBF scheme can be used to considerable advantage for the purposes of image recognition and analysis.
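A minimal sketch (our notation) of an oriented non-radial basis unit: the squared Euclidean distance in the Gaussian's exponent is replaced by a quadratic form whose matrix encodes rotation and per-axis scaling, giving oriented hyper-ellipses as receptive regions. The 2-D setup below is an assumption for the example.

    import numpy as np

    def oriented_basis(x, center, angle, scales):
        # Rotate into the unit's frame, scale each axis, then take the Gaussian
        c, s = np.cos(angle), np.sin(angle)
        R = np.array([[c, -s], [s, c]])
        S = np.diag(1.0 / np.asarray(scales))
        z = S @ R.T @ (x - center)       # transformed coordinates
        return np.exp(-0.5 * z @ z)

    # A unit elongated along a 30-degree axis
    x = np.array([0.5, 0.2])
    print(oriented_basis(x, center=np.zeros(2), angle=np.pi / 6, scales=(2.0, 0.5)))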
A competitive modular connectionist architecture
Jacobs, Robert A., Jordan, Michael I.
We describe a multi-network, or modular, connectionist architecture that captures the fact that many tasks have structure at a level of granularity intermediate to that assumed by local and global function approximation schemes. The main innovation of the architecture is that it combines associative and competitive learning in order to learn task decompositions. A task decomposition is discovered by forcing the networks comprising the architecture to compete to learn the training patterns. As a result of the competition, different networks learn different training patterns and thus learn to partition the input space. The performance of the architecture on a "what" and "where" vision task and on a multi-payload robotics task is presented.
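A minimal sketch (our simplification, with illustrative names and sizes) of the competitive modular idea: several expert networks see the same input, a gating network softly weights their outputs, and training assigns credit to whichever expert currently fits a pattern best, so the experts come to partition the input space.

    import numpy as np

    rng = np.random.default_rng(2)
    n_experts, d_in = 3, 4
    experts = rng.normal(0, 0.1, (n_experts, d_in))   # linear experts, scalar output each
    gates = rng.normal(0, 0.1, (n_experts, d_in))     # gating network weights

    def forward(x):
        y = experts @ x                               # each expert's prediction
        g = np.exp(gates @ x)
        g /= g.sum()                                  # softmax gating
        return y, g

    def train_step(x, target, lr=0.1):
        global experts, gates
        y, g = forward(x)
        err = target - y                              # per-expert error
        h = g * np.exp(-0.5 * err ** 2)               # responsibility: lower error, more credit
        h /= h.sum()
        experts += lr * (h * err)[:, None] * x        # responsibility-weighted expert updates
        gates += lr * (h - g)[:, None] * x            # move gates toward responsibilities

    x = rng.normal(0, 1, d_in)
    train_step(x, target=1.0)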
Self-organization of Hebbian Synapses in Hippocampal Neurons
Brown, Thomas H., Mainen, Zachary F., Zador, Anthony M., Claiborne, Brenda J.
We are exploring the significance of biological complexity for neuronal computation. Here we demonstrate that Hebbian synapses in realistically modeled hippocampal pyramidal cells may give rise to two novel forms of self-organization in response to structured synaptic input. First, on the basis of the electrotonic relationships between synaptic contacts, a cell may become tuned to a small subset of its input space. Second, the same mechanisms may produce clusters of potentiated synapses across the space of the dendrites. The latter type of self-organization may be functionally significant in the presence of nonlinear dendritic conductances.
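A toy sketch (our abstraction, far simpler than the paper's compartmental models) of the first mechanism: a Hebbian synapse's influence on the soma is attenuated by its electrotonic distance, so co-active, electrotonically favorable synapses potentiate together. All parameters are illustrative.

    import numpy as np

    rng = np.random.default_rng(3)
    n_syn = 20
    dist = np.linspace(0.1, 2.0, n_syn)    # electrotonic distance from the soma
    atten = np.exp(-dist)                   # passive attenuation factor
    w = np.full(n_syn, 0.5)                 # synaptic weights

    for _ in range(200):
        pre = (rng.random(n_syn) < 0.2).astype(float)   # presynaptic activity
        post = (atten * w * pre).sum()                  # somatic depolarization
        w += 0.01 * pre * post * atten                  # Hebbian update, gated by attenuation
        w = np.clip(w, 0.0, 1.0)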
A Connectionist Learning Control Architecture for Navigation
A novel learning control architecture is used for navigation. A sophisticated test-bed is used to simulate a cylindrical robot with a sonar belt in a planar environment. The task is short-range homing in the presence of obstacles. The robot receives no global information and assumes no comprehensive world model. Instead, the robot receives only sensory information, which is inherently limited. A connectionist architecture is presented which incorporates a large amount of a priori knowledge in the form of hard-wired networks, architectural constraints, and initial weights. Instead of hard-wiring static potential fields from object models, my architecture learns sensor-based potential fields, automatically adjusting them to avoid local minima and to produce efficient homing trajectories. It does this without object models, using only sensory information. This research demonstrates the use of a large modular architecture on a difficult task.
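A minimal sketch (our illustration, not the paper's architecture) of a sensor-based potential field: a small network maps the sonar readings to a scalar potential, and the robot picks the heading whose predicted sonar view has the lowest potential. The sensor model and weights below are hypothetical placeholders.

    import numpy as np

    rng = np.random.default_rng(4)
    W = rng.normal(0, 0.1, 8)               # weights over 8 sonar readings (placeholders
                                            # standing in for values learned in training)

    def potential(sonar):
        # Learned scalar potential from the sonar belt (linear here for brevity)
        return float(W @ sonar)

    def choose_heading(read_sonar, headings):
        # Descend the field: pick the candidate heading with the lowest potential
        return min(headings, key=lambda h: potential(read_sonar(h)))

    def read_sonar_stub(heading):
        # Hypothetical sensor model for the example
        return np.abs(np.sin(heading + np.arange(8)))

    print(choose_heading(read_sonar_stub, headings=np.linspace(0, 2 * np.pi, 16)))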