AITopics

Although the detection of invariant structure in a given set of input patterns is vital to many recognition tasks, connectionist learning rules tend to focus on directions of high variance (principal components). The prediction paradigm is often used to reconcile this dichotomy; here we suggest a more direct approach to invariant learning based on an anti-Hebbian learning rule. An unsupervised two-layer network implementing this method in a competitive setting learns to extract coherent depth information from random-dot stereograms.

1 INTRODUCTION: LEARNING INVARIANT STRUCTURE

Many connectionist learning algorithms share with principal component analysis (Jolliffe, 1986) the strategy of extracting the directions of highest variance from the input. A single Hebbian neuron, for instance, will come to encode the input's first principal component (Oja and Karhunen, 1985); various forms of lateral interaction can be used to force a layer of such nodes to differentiate and span the principal component subspace - cf. (Sanger, 1989; Kung, 1990; Leen, 1991), and others. The same type of representation also develops in the hidden layer of backpropagation autoassociator networks (Baldi and Hornik, 1989).

artificial intelligence, disparity, machine learning

Country: North America > United States > California > San Diego County (0.15)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
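The introduction excerpted above notes that a single Hebbian neuron comes to encode the first principal component of its input. A minimal sketch of that behaviour, assuming Oja's normalized Hebbian update for one linear unit, is given below. The function names, the synthetic two-dimensional data, and the learning-rate and epoch values are illustrative assumptions, not taken from the paper, and the anti-Hebbian network the abstract proposes is not reproduced here.

```python
import math
import random


def make_samples(n=500, seed=1):
    """Hypothetical 2-D data: high variance along (1, 1), low along (1, -1)."""
    rng = random.Random(seed)
    s = 1.0 / math.sqrt(2.0)
    samples = []
    for _ in range(n):
        a = rng.gauss(0.0, 2.0)  # coordinate along the high-variance axis
        b = rng.gauss(0.0, 0.5)  # coordinate along the low-variance axis
        samples.append((s * (a + b), s * (a - b)))
    return samples


def oja_first_component(samples, lr=0.005, epochs=20, seed=0):
    """Train one linear unit with Oja's rule; return its unit-length weights."""
    rng = random.Random(seed)
    w = [rng.gauss(0.0, 1.0) for _ in samples[0]]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    for _ in range(epochs):
        for x in samples:
            y = sum(wi * xi for wi, xi in zip(w, x))  # unit output
            # Hebbian term y*x plus a decay term -y^2*w that bounds the norm.
            w = [wi + lr * y * (xi - y * wi) for wi, xi in zip(w, x)]
    norm = math.sqrt(sum(v * v for v in w))
    return [v / norm for v in w]
```

On this synthetic data the learned weight vector should line up, up to sign, with the (1, 1) direction, which is the axis of highest variance by construction.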

Repeat Until Bored: A Pattern Selection Strategy

Munro, Paul W.

An alternative to the typical technique of selecting training examples independently from a fixed distribution is formulated and analyzed, in which the current example is presented repeatedly until the error for that item is reduced to some criterion value; then, another item is randomly selected. The convergence time can be dramatically increased or decreased by this heuristic, depending on the task, and is very sensitive to the value of the criterion.

artificial intelligence, machine learning, selection

Country: North America > United States (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
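The selection strategy described in the Munro abstract above can be sketched as a training loop. This is only an illustration under stated assumptions: the learner is a single linear unit trained with the delta rule, and the names `predict` and `train_repeat_until_bored`, the toy data set, and the default `criterion`, `lr`, and `max_presentations` values are all hypothetical rather than taken from the paper.

```python
import random


def predict(w, x):
    """Output of a linear unit (assumed learner, not from the paper)."""
    return sum(wi * xi for wi, xi in zip(w, x))


def train_repeat_until_bored(data, criterion=0.05, lr=0.1,
                             max_presentations=20000, seed=0):
    """Present one item repeatedly until its error meets the criterion,
    then pick another item at random. Returns (weights, presentations)."""
    rng = random.Random(seed)
    w = [0.0] * len(data[0][0])
    presentations = 0
    while presentations < max_presentations:
        # Stop once every item is within the criterion.
        if all(abs(t - predict(w, x)) <= criterion for x, t in data):
            break
        x, t = rng.choice(data)
        # Repeat the same item until the learner is "bored" with it.
        while presentations < max_presentations:
            err = t - predict(w, x)
            if abs(err) <= criterion:
                break
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]  # delta rule
            presentations += 1
    return w, presentations


# Toy, linearly realizable task: t = 0.5 + 2*x1 - x2 (first input is a bias).
DATA = [((1.0, 0.0, 0.0), 0.5), ((1.0, 0.0, 1.0), -0.5),
        ((1.0, 1.0, 0.0), 2.5), ((1.0, 1.0, 1.0), 1.5)]
```

Calling `train_repeat_until_bored(DATA)` with different `criterion` values gives a quick way to see how the number of presentations depends on the criterion for this toy task; it says nothing about the tasks studied in the paper.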

Nowlan, Steven J., Hinton, Geoffrey E.

Adaptive Soft Weight Tying using Gaussian Mixtures

One way of simplifying neural networks so they generalize better is to add an extra term

adaptive soft weight tying, artificial intelligence, machine learning

Country:

North America > United States > California (0.28)
North America > Canada > Ontario > Toronto (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)

Best-First Model Merging for Dynamic Learning and Recognition

Omohundro, Stephen M.

Stephen M. Omohundro, International Computer Science Institute, 1947 Center Street, Suite 600, Berkeley, California 94704

Abstract: "Best-first model merging" is a general technique for dynamically choosing the structure of a neural or related architecture while avoiding overfitting. It is applicable to both learning and recognition tasks and often generalizes significantly better than fixed structures. We demonstrate the approach applied to the tasks of choosing radial basis functions for function learning, choosing local affine models for curve and constraint surface modelling, and choosing the structure of a balltree or bumptree to maximize efficiency of access.

1 TOWARD MORE COGNITIVE LEARNING

Standard backpropagation neural networks learn in a way which appears to be quite different from human learning. Viewed as a cognitive system, a standard network always maintains a complete model of its domain. This model is mostly wrong initially, but gets gradually better and better as data appears. The net deals with all data in much the same way and has no representation for the strength of evidence behind a certain conclusion. The network architecture is usually chosen before any data is seen and the processing is much the same in the early phases of learning as in the late phases.

Krogh, Anders, Hertz, John A.

A Simple Weight Decay Can Improve Generalization

It has been observed in numerical simulations that a weight decay can improve generalization in a feed-forward neural network.

artificial intelligence, machine learning, weight decay

Country:

Europe (0.29)
North America > United States > California > Santa Cruz County > Santa Cruz (0.14)

Genre: Research Report (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
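To make the weight-decay idea in the Krogh and Hertz abstract above concrete, here is a small sketch. It assumes the common form in which a term proportional to each weight is subtracted at every gradient step (equivalent to a quadratic penalty on the weights); the excerpt itself does not state the form used. The linear model, the function name `fit_linear`, the toy data, and the `decay`, `lr`, and `steps` values are hypothetical.

```python
def fit_linear(data, decay=0.0, lr=0.05, steps=2000):
    """Batch gradient descent on mean squared error with optional weight decay."""
    dim = len(data[0][0])
    n = len(data)
    w = [0.0] * dim
    for _ in range(steps):
        grad = [0.0] * dim
        for x, t in data:
            err = sum(wi * xi for wi, xi in zip(w, x)) - t
            for j in range(dim):
                grad[j] += err * x[j] / n
        # The decay term pulls every weight toward zero at each step.
        w = [wi - lr * (g + decay * wi) for wi, g in zip(w, grad)]
    return w


# Toy data (assumed): four 2-D inputs with noisy targets.
DATA = [((1.0, 0.0), 1.0), ((0.0, 1.0), -1.0),
        ((1.0, 1.0), 0.2), ((1.0, -1.0), 1.9)]

w_plain = fit_linear(DATA, decay=0.0)
w_decay = fit_linear(DATA, decay=0.5)
```

With decay switched on, the fitted weights come out smaller in norm than the unregularized fit. Whether that helps generalization depends on the task and the noise level, which this toy example does not test.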

Siu, Kai-Yeung, Bruck, Jehoshua

Neural Computing with Small Weights

Kai-Yeung Siu, Dept. of Electrical & Computer Engineering, University of California, Irvine, Irvine, CA 92717
Jehoshua Bruck, IBM Research Division, Almaden Research Center, San Jose, CA 95120-6099

Abstract: An important issue in neural computation is the dynamic range of weights in the neural networks. Many experimental results on learning indicate that the weights in the networks can grow prohibitively large with the size of the inputs. Here we address this issue by studying the tradeoffs between the depth and the size of weights in polynomial-size networks of linear threshold elements (LTEs). We show that there is an efficient way of simulating a network of LTEs with large weights by a network of LTEs with small weights. To prove these results, we use tools from harmonic analysis of Boolean functions.

logic & formal reasoning, machine learning, n-bit number

Country: North America > United States > California > Orange County > Irvine (0.54)

Industry:

Government > Regional Government > North America Government > United States Government (0.69)
Government > Military (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.55)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.38)

Zhao, Ying, Atkeson, Christopher G.

Some Approximation Properties of Projection Pursuit Learning Networks

Ying Zhao, Christopher G. Atkeson, The Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02139

Abstract: This paper will address an important question in machine learning: What kind of network architectures work better on what kind of problems? A projection pursuit learning network has a very similar structure to a one hidden layer sigmoidal neural network. A general method based on a continuous version of projection pursuit regression is developed to show that projection pursuit regression works better on angular smooth functions than on Laplacian smooth functions. There exists a ridge function approximation scheme to avoid the curse of dimensionality for approximating functions in L^2(φ_d).

1 INTRODUCTION

Projection pursuit is a nonparametric statistical technique to find "interesting" low dimensional projections of high dimensional data sets. It has been used for nonparametric fitting and other data-analytic purposes (Friedman and Stuetzle, 1981; Huber, 1985).

artificial intelligence, machine learning, smooth function

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.24)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.71)

Ji, Chuanyi, Psaltis, Demetri

The VC-Dimension versus the Statistical Capacity of Multilayer Networks

The former characterizes their

Present address: Department of Electrical, Computer and System Engineering, Rensselaer Polytechnic Institute, Troy, NY 12180.

artificial intelligence, machine learning, vc-dimension

Country: North America > United States > New York > Rensselaer County > Troy (0.24)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.50)

Kuh, Anthony, Petsche, Thomas, Rivest, Ronald L.

Incrementally Learning Time-varying Half-planes

For a dichotomy, concept drift means that the classification function changes over time. We want to extend the theoretical analyses of learning to include time-varying concepts; to explore the behavior of current learning algorithms in the face of concept drift; and to devise tracking algorithms to better handle concept drift. In this paper, we briefly describe our theoretical model and then present the results of simulations

*kuh@wiliki.eng.hawaii.edu

adversary, artificial intelligence, machine learning

Country:

North America > United States > Hawaii (0.34)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.48)

Freund, Yoav, Haussler, David

Unsupervised learning of distributions on binary vectors using two layer networks

We study a particular type of Boltzmann machine with a bipartite graph structure called a harmonium. Our interest is in using such a machine to model a probability distribution on binary input vectors. We analyze the class of probability distributions that can be modeled by such machines.

artificial intelligence, harmonium model, machine learning

Country: North America > United States > California > Santa Cruz County > Santa Cruz (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.37)
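As an illustration of how a bipartite Boltzmann machine of the kind described in the Freund and Haussler abstract above can define a distribution on binary input vectors, the sketch below enumerates all inputs of a tiny model and sums out the hidden units analytically. It assumes units taking values in {0, 1} and an energy of the form E(x, h) = -(b·x + c·h + h·Wx); the excerpt does not specify these conventions, and the function name and parameter values are hypothetical.

```python
import itertools
import math


def harmonium_distribution(W, b, c):
    """Return {x: p(x)} over all binary inputs x of a small harmonium.

    W[j][i] couples hidden unit j to input i; b and c are input and
    hidden biases. Units are assumed to take values in {0, 1}.
    """
    unnormalized = {}
    for x in itertools.product((0, 1), repeat=len(b)):
        log_p = sum(bi * xi for bi, xi in zip(b, x))
        for row, cj in zip(W, c):
            act = cj + sum(wji * xi for wji, xi in zip(row, x))
            # Summing exp(h_j * act) over h_j in {0, 1} gives 1 + exp(act).
            log_p += math.log1p(math.exp(act))
        unnormalized[x] = math.exp(log_p)
    z = sum(unnormalized.values())  # partition function by brute force
    return {x: v / z for x, v in unnormalized.items()}


# One hidden unit that favours both inputs being on together.
dist = harmonium_distribution(W=[[3.0, 3.0]], b=[0.0, 0.0], c=[-3.0])
```

Brute-force enumeration is only feasible for a handful of input units, but it shows how each hidden unit contributes one multiplicative factor to the probability of an input vector.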