Statistical Learning
Fast Non-Linear Dimension Reduction
Kambhatla, Nanda, Leen, Todd K.
We propose a new distance measure that is optimal for the task of local PCA. Our results with speech and image data indicate that the nonlinear techniques provide more accurate encodings than PCA. Our local linear algorithm produces more accurate encodings (except for one simulation with image data) and trains much faster than five-layer auto-associative networks. Acknowledgments: This work was supported by grants from the Air Force Office of Scientific Research (F49620-93-1-0253) and Electric Power Research Institute (RP8015-2). The authors are grateful to Gary Cottrell and David DeMers for providing their image database and clarifying their experimental results. We also thank our colleagues in the Center for Spoken Language Understanding at OGI for providing speech data.
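The local linear idea in this abstract can be illustrated with a minimal sketch: partition the data into regions, then fit a separate principal direction in each region. This is an illustration under stated assumptions, not the authors' algorithm. It partitions with plain Euclidean k-means rather than the distance measure the paper proposes, keeps only one principal direction per region, and all function names are hypothetical.

```python
import math

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(x, centroids):
    # Index of the centroid closest to x (Euclidean; an assumption, see lead-in).
    return min(range(len(centroids)), key=lambda j: sq_dist(x, centroids[j]))

def mean(points):
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]

def leading_direction(points, center, iters=50):
    # Leading principal direction by power iteration on the covariance,
    # computed as w = sum_k (x_k . v) x_k without forming the matrix.
    d = len(center)
    centered = [[p[i] - center[i] for i in range(d)] for p in points]
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [0.0] * d
        for x in centered:
            proj = sum(xi * vi for xi, vi in zip(x, v))
            for i in range(d):
                w[i] += proj * x[i]
        norm = math.sqrt(sum(wi * wi for wi in w))
        if norm == 0.0:
            break  # degenerate region: no variance along v
        v = [wi / norm for wi in w]
    return v

def fit_local_pca(data, init_centroids, iters=10):
    centroids = [list(c) for c in init_centroids]
    for _ in range(iters):  # plain k-means updates
        labels = [nearest(x, centroids) for x in data]
        for j in range(len(centroids)):
            members = [x for x, lab in zip(data, labels) if lab == j]
            if members:
                centroids[j] = mean(members)
    labels = [nearest(x, centroids) for x in data]
    directions = [
        leading_direction([x for x, lab in zip(data, labels) if lab == j], centroids[j])
        for j in range(len(centroids))
    ]
    return centroids, directions

def encode(x, centroids, directions):
    # Code = (region index, coordinate along that region's principal direction).
    j = nearest(x, centroids)
    coeff = sum((xi - ci) * vi for xi, ci, vi in zip(x, centroids[j], directions[j]))
    return j, coeff

def decode(code, centroids, directions):
    j, coeff = code
    return [ci + coeff * vi for ci, vi in zip(centroids[j], directions[j])]
```

For data lying on two differently oriented line segments, this one-dimensional local code reconstructs every point, whereas a single global one-dimensional PCA cannot.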
Memory-Based Methods for Regression and Classification
Dietterich, Thomas G., Wettschereck, Dietrich, Atkeson, Chris G., Moore, Andrew W.
Memory-based learning methods operate by storing all (or most) of the training data and deferring analysis of that data until "run time" (i.e., when a query is presented and a decision or prediction must be made). When a query is received, these methods generally answer the query by retrieving and analyzing a small subset of the training data, namely data in the immediate neighborhood of the query point. In short, memory-based methods are "lazy" (they wait until the query) and "local" (they use only a local neighborhood). The purpose of this workshop was to review the state of the art in memory-based methods and to understand their relationship to "eager" and "global" learning algorithms such as batch backpropagation. There are two essential components to any memory-based algorithm: the method for defining the "local neighborhood" and the learning method that is applied to the training examples in the local neighborhood.
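The two components named above can be made concrete with a minimal sketch. It assumes the simplest choices for both: the local neighborhood is the k nearest stored examples under Euclidean distance, and the local learning method is a plain average of their targets. The function name and data layout are hypothetical.

```python
import math

def knn_predict(memory, query, k=3):
    """Answer a query from stored examples; no work is done before the query arrives.

    memory: list of (x, y) pairs, x a tuple of floats, y a float.
    """
    # Component 1: define the local neighborhood (k nearest stored points).
    neighbors = sorted(memory, key=lambda ex: math.dist(ex[0], query))[:k]
    # Component 2: apply a local learner (here, the mean of the neighbors' targets).
    return sum(y for _, y in neighbors) / len(neighbors)
```

Swapping the average for a distance-weighted average or a local linear fit changes only the second component; the lazy, local structure stays the same.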
Clustering with a Domain-Specific Distance Measure
Gold, Steven, Mjolsness, Eric, Rangarajan, Anand
Critical features of a domain (such as invariance under translation, rotation, and permutation) are captured within the clustering procedure, rather than reflected in the properties of feature sets created prior to clustering. The distance measure and learning problem are formally described as nested objective functions. We derive an efficient algorithm by using optimization techniques that allow us to divide up the objective function into parts which may be minimized in distinct phases. The algorithm has accurately recreated 10 prototypes from a randomly generated sample database of 100 images consisting of 20 points each in 120 experiments. Finally, by incorporating permutation invariance in our distance measure, we have a technique that we may be able to apply to the clustering of graphs. Our goal is to develop measures which will enable the learning of objects with shape or structure. Acknowledgements This work has been supported by AFOSR grant F49620-92-J-0465 and ONR/DARPA grant N00014-92-J-4048.
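To show what building invariance into the distance itself can look like, here is a small sketch of a distance between two equal-size 2-D point sets that is invariant to translation and to the ordering of the points. It is a brute-force stand-in, not the paper's nested-objective optimization: it omits rotation, it tries every permutation (so it is only practical for a handful of points), and the function names are hypothetical.

```python
from itertools import permutations

def centered(points):
    # Subtracting the centroid gives the optimal translation for any point matching.
    n = len(points)
    cx = sum(p[0] for p in points) / n
    cy = sum(p[1] for p in points) / n
    return [(p[0] - cx, p[1] - cy) for p in points]

def invariant_distance(a, b):
    """Sum of squared point distances, minimized over translations and orderings."""
    a_c, b_c = centered(a), centered(b)
    best = float("inf")
    for perm in permutations(range(len(b_c))):
        cost = sum(
            (ax - b_c[j][0]) ** 2 + (ay - b_c[j][1]) ** 2
            for (ax, ay), j in zip(a_c, perm)
        )
        best = min(best, cost)
    return best
```

A shifted and shuffled copy of a point set is at distance zero (up to rounding) from the original, so a clustering procedure using this distance would treat the two as the same object.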
A Comparative Study of a Modified Bumptree Neural Network with Radial Basis Function Networks and the Standard Multi Layer Perceptron
Bostock, Richard T. J., Harget, Alan J.
Bumptrees are geometric data structures introduced by Omohundro (1991) to provide efficient access to a collection of functions on a Euclidean space of interest. We describe a modified bumptree structure that has been employed as a neural network classifier, and compare its performance on several classification tasks against that of radial basis function networks and the standard multi-layer perceptron. 1 INTRODUCTION A number of neural network studies have demonstrated the utility of the multi-layer perceptron (MLP) and shown it to be a highly effective paradigm. Studies have also shown, however, that the MLP is not without its problems: in particular, it requires an extensive training time, is susceptible to local minima, and its performance depends on its internal network architecture. In an attempt to improve upon the generalisation performance and computational efficiency, a number of studies have been undertaken, principally concerned with investigating the parametrisation of the MLP. It is well known, for example, that the generalisation performance of the MLP is affected by the number of hidden units in the network, which has to be determined empirically since theory provides no guidance.
Classifying Hand Gestures with a View-Based Distributed Representation
Darrell, Trevor J., Pentland, Alex P.
We present a method for learning, tracking, and recognizing human hand gestures recorded by a conventional CCD camera without any special gloves or other sensors. A view-based representation is used to model aspects of the hand relevant to the trained gestures, and is found using an unsupervised clustering technique. We use normalized correlation networks, with dynamic time warping in the temporal domain, as a distance function for unsupervised clustering. Views are computed separably for space and time dimensions; the distributed response of the combination of these units characterizes the input data with a low dimensional representation. A supervised classification stage uses labeled outputs of the spatiotemporal units as training data. Our system can correctly classify gestures in real time with a low-cost image processing accelerator.
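Dynamic time warping, which the abstract uses in the temporal domain, can be sketched with the standard dynamic-programming recurrence. This is a generic textbook version, not the authors' implementation. The default per-frame cost (absolute difference of scalars) is an assumption made for brevity; the paper's per-frame score is based on normalized correlation.

```python
def dtw_distance(a, b, cost=lambda x, y: abs(x - y)):
    """Minimum cumulative cost of aligning sequence a to sequence b.

    Each element of a may be matched to one or more consecutive elements
    of b (and vice versa), so sequences of different speed can still align.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # D[i][j] = best cost of aligning a[:i] with b[:j].
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
            D[i][j] = cost(a[i - 1], b[j - 1]) + step
    return D[n][m]
```

A sequence and a slowed-down copy of it (for example `[1, 2, 3]` and `[1, 2, 2, 3]`) align at zero cost, which is what makes the measure useful for gestures performed at different speeds.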
A Learning Analog Neural Network Chip with Continuous-Time Recurrent Dynamics
The recurrent network, containing six continuous-time analog neurons and 42 free parameters (connection strengths and thresholds), is trained to generate time-varying outputs approximating given periodic signals presented to the network. The chip implements a stochastic perturbative algorithm, which observes the error gradient along random directions in the parameter space for error-descent learning. In addition to the integrated learning functions and the generation of pseudo-random perturbations, the chip provides for teacher forcing and long-term storage of the volatile parameters. The network learns a 1 kHz circular trajectory in 100 sec. The chip occupies 2 mm x 2 mm in a 2 µm CMOS process, and dissipates 1.2 mW. 1 Introduction Exact gradient-descent algorithms for supervised learning in dynamic recurrent networks [1-3] are fairly complex and do not provide for a scalable implementation in a standard 2-D VLSI process. We have implemented a fairly simple and scalable (·Present address: Johns Hopkins University, ECE Dept., Baltimore MD 21218-2686.)
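The idea of observing the error gradient along random directions can be sketched in software as follows. This is a minimal illustration under assumptions, not a description of the chip: it uses a one-sided error difference, random perturbations of fixed magnitude `sigma` with random signs, and a fixed learning rate `lr`; all names and default values are hypothetical.

```python
import random

def perturbative_descent(error, params, steps=2000, sigma=1e-3, lr=0.1, seed=0):
    """Descend an error function using only error evaluations, no explicit gradient."""
    rng = random.Random(seed)
    p = list(params)
    for _ in range(steps):
        # Perturb all parameters at once along a random sign pattern.
        pert = [sigma * rng.choice((-1.0, 1.0)) for _ in p]
        # Observed change in error along that random direction.
        delta = error([pi + di for pi, di in zip(p, pert)]) - error(p)
        # delta * pert / sigma**2 equals the gradient on average over sign
        # patterns (to first order); step against it. For stability lr should
        # be small relative to 1 / (number of parameters * curvature).
        p = [pi - lr * delta / sigma ** 2 * di for pi, di in zip(p, pert)]
    return p
```

Each update needs only two error measurements regardless of the number of parameters, which is the property that makes this family of methods attractive for hardware.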
A Hybrid Radial Basis Function Neurocomputer and Its Applications
Watkins, Steven S., Chau, Paul M., Tawel, Raoul, Lambrigtsen, Bjorn, Plutowski, Mark
A neurocomputer was implemented using radial basis functions and a combination of analog and digital VLSI circuits. The hybrid system uses custom analog circuits for the input layer and a digital signal processing board for the hidden and output layers. The system combines the advantages of both analog and digital circuits.
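For readers unfamiliar with the network type, the computation of a radial basis function network can be sketched as below. The abstract does not specify the basis function or parameterization, so the Gaussian units, the per-unit widths, and all names here are assumptions for illustration only.

```python
import math

def rbf_forward(x, centers, widths, weights, bias=0.0):
    """Output of a single-output RBF network for input vector x."""
    # Hidden layer: each unit responds according to the distance from x to its center.
    hidden = [
        math.exp(-sum((xi - ci) ** 2 for xi, ci in zip(x, c)) / (2.0 * s * s))
        for c, s in zip(centers, widths)
    ]
    # Output layer: weighted sum of the hidden responses.
    return bias + sum(w * h for w, h in zip(weights, hidden))
```

The distance computation and the weighted sum are separate stages, which is what allows a system to assign them to different kinds of hardware.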
Exploiting Chaos to Control the Future
Flake, Gary W., Sun, Guo-Zhen, Lee, Yee-Chun
Recently, Ott, Grebogi and Yorke (OGY) [6] found an effective method to control chaotic systems to unstable fixed points by using only small control forces; however, OGY's method is based on and limited to a linear theory and requires considerable knowledge of the dynamics of the system to be controlled. In this paper we use two radial basis function networks: one as a model of an unknown plant and the other as the controller. The controller is trained with a recurrent learning algorithm to minimize a novel objective function such that the controller can locate an unstable fixed point and drive the system into the fixed point with no a priori knowledge of the system dynamics. Our results indicate that the neural controller offers many advantages over OGY's technique.
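The linear, model-dependent baseline that the abstract contrasts with can be illustrated on a one-dimensional example. The sketch below applies OGY-style linear feedback to the logistic map by making small adjustments to its parameter `r` when the state is near the unstable fixed point. It is not the authors' neural controller. The choice of the logistic map, the window size, and the cap on the parameter change are all assumptions, and the controller uses exact knowledge of the map's derivatives, which is the kind of prior knowledge the paper aims to avoid.

```python
def logistic(x, r):
    return r * x * (1.0 - x)

def run(x0, r0=3.9, steps=50, control=True, window=0.01, max_dr=0.1):
    """Iterate the logistic map and return |x - x_star| after each step."""
    x_star = 1.0 - 1.0 / r0          # unstable fixed point of the uncontrolled map
    df_dx = 2.0 - r0                 # derivative of the map in x at (x_star, r0)
    df_dr = x_star * (1.0 - x_star)  # derivative of the map in r at (x_star, r0)
    x, deviations = x0, []
    for _ in range(steps):
        dr = 0.0
        if control and abs(x - x_star) < window:
            # Choose dr so the linearized next state lands on the fixed point.
            dr = -df_dx * (x - x_star) / df_dr
            dr = max(-max_dr, min(max_dr, dr))  # keep the control force small
        x = logistic(x, r0 + dr)
        deviations.append(abs(x - x_star))
    return deviations
```

Started close to the fixed point, the controlled orbit stays there using only small parameter changes, while the uncontrolled orbit moves away within a few iterations.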
A Hodgkin-Huxley Type Neuron Model That Learns Slow Non-Spike Oscillation
Doya, Kenji, Selverston, Allen I., Rowat, Peter F.
A gradient descent algorithm for parameter estimation which is similar to those used for continuous-time recurrent neural networks was derived for Hodgkin-Huxley type neuron models. Using membrane potential trajectories as targets, the parameters (maximal conductances, thresholds and slopes of activation curves, time constants) were successfully estimated. The algorithm was applied to modeling slow non-spike oscillation of an identified neuron in the lobster stomatogastric ganglion. A model with three ionic currents was trained with experimental data. It revealed a novel role of A-current for slow oscillation below -50 mV. 1 INTRODUCTION Conductance-based neuron models, first formulated by Hodgkin and Huxley [10], are commonly used for describing biophysical mechanisms underlying neuronal behavior.