Technology
Competitive Anti-Hebbian Learning of Invariants
Schraudolph, Nicol N., Sejnowski, Terrence J.
Although the detection of invariant structure in a given set of input patterns is vital to many recognition tasks, connectionist learning rules tend to focus on directions of high variance (principal components). The prediction paradigm is often used to reconcile this dichotomy; here we suggest a more direct approach to invariant learning based on an anti-Hebbian learning rule. An unsupervised tWO-layer network implementing this method in a competitive setting learns to extract coherent depth information from random-dot stereograms. 1 INTRODUCTION: LEARNING INVARIANT STRUCTURE Many connectionist learning algorithms share with principal component analysis (Jolliffe, 1986) the strategy of extracting the directions of highest variance from the input. A single Hebbian neuron, for instance, will come to encode the input's first principal component (Oja and Karhunen, 1985); various forms of lateral interaction can be used to force a layer of such nodes to differentiate and span the principal component subspace - cf. (Sanger, 1989; Kung, 1990; Leen, 1991), and others. The same type of representation also develops in the hidden layer of backpropagation autoassociator networks (Baldi and Hornik, 1989).
Repeat Until Bored: A Pattern Selection Strategy
An alternative to the typical technique of selecting training examples independently from a fixed distribution is fonnulated and analyzed, in which the current example is presented repeatedly until the error for that item is reduced to some criterion value,; then, another item is randomly selected.The convergence time can be dramatically increased or decreased by this heuristic, depending on the task, and is very sensitive to the value of .
Best-First Model Merging for Dynamic Learning and Recognition
Stephen M. Omohundro International Computer Science Institute 1947 CenteJ' Street, Suite 600 Berkeley, California 94704 Abstract "Best-first model merging" is a general technique for dynamically choosing the structure of a neural or related architecture while avoiding overfitting.It is applicable to both leaming and recognition tasks and often generalizes significantly better than fixed structures. We demonstrate theapproach applied to the tasks of choosing radial basis functions for function learning, choosing local affine models for curve and constraint surface modelling, and choosing the structure of a balltree or bumptree to maximize efficiency of access. 1 TOWARD MORE COGNITIVE LEARNING Standard backpropagation neural networks learn in a way which appears to be quite different fromhuman leaming. Viewed as a cognitive system, a standard network always maintains acomplete model of its domain. This model is mostly wrong initially, but gets gradually better and better as data appears. The net deals with all data in much the same way and has no representation for the strength of evidence behind a certain conclusion. The network architecture is usually chosen before any data is seen and the processing is much the same in the early phases of learning as in the late phases.
Neural Computing with Small Weights
Siu, Kai-Yeung, Bruck, Jehoshua
Kai-Yeung Siu Dept. of Electrical & Computer Engineering University of California, Irvine Irvine, CA 92717 Jehoshua Bruck IBM Research Division Almaden Research Center San Jose, CA 95120-6099 Abstract An important issue in neural computation is the dynamic range of weights in the neural networks. Many experimental results on learning indicate that the weights in the networks can grow prohibitively large with the size of the inputs. Here we address this issue by studying the tradeoffs between the depth and the size of weights in polynomial-size networks of linear threshold elements (LTEs). We show that there is an efficient way of simulating a network of LTEs with large weights by a network of LTEs with small weights. To prove these results, we use tools from harmonic analysis of Boolean functions.
Some Approximation Properties of Projection Pursuit Learning Networks
Zhao, Ying, Atkeson, Christopher G.
Ying Zhao Christopher G. Atkeson The Artificial Intelligence Laboratory Massachusetts Institute of Technology Cambridge, MA 02139 Abstract This paper will address an important question in machine learning: What kind of network architectures work better on what kind of problems? A projection pursuit learning network has a very similar structure to a one hidden layer sigmoidal neural network. A general method based on a continuous version of projection pursuit regression is developed to show that projection pursuit regression works better on angular smooth functions thanon Laplacian smooth functions. There exists a ridge function approximation scheme to avoid the curse of dimensionality for approximating functionsin L 2(ยขd). 1 INTRODUCTION Projection pursuit is a nonparametric statistical technique to find "interesting" low dimensional projections of high dimensional data sets. It has been used for nonparametric fitting and other data-analytic purposes (Friedman and Stuetzle, 1981, Huber, 1985).
Incrementally Learning Time-varying Half-planes
Kuh, Anthony, Petsche, Thomas, Rivest, Ronald L.
For a dichotomy, concept drift means that the classification function changes over time. We want to extend the theoretical analyses of learning to include time-varying concepts; to explore the behavior of current learning algorithms in the face of concept drift; and to devise tracking algorithms to better handle concept drift. In this paper, we briefly describe our theoretical model and then present the results of simulations *kuh@wiliki.eng.hawaii.edu