AITopics

Country: Europe > Germany (0.34)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

LeCun, Yann, Simard, Patrice Y., Pearlmutter, Barak

Automatic Learning Rate Maximization by On-Line Estimation of the Hessian's Eigenvectors

Inst., 19600 NW vonNeumann Dr, Beaverton, OR 97006. Abstract: We propose a very simple and well-principled way of computing the optimal step size in gradient descent algorithms. The online version is computationally very efficient and is applicable to large backpropagation networks trained on large data sets. The main ingredient is a technique for estimating the principal eigenvalue(s) and eigenvector(s) of the objective function's second-derivative matrix (Hessian), which does not even require calculating the Hessian. Several other applications of this technique are proposed for speeding up learning or for eliminating useless parameters. 1 INTRODUCTION Choosing the appropriate learning rate, or step size, in a gradient descent procedure such as backpropagation is simultaneously one of the most crucial and most expert-intensive parts of neural-network learning. We propose a method for computing the best step size that is well-principled, simple, very cheap computationally, and, most of all, applicable to online training with large networks and data sets.
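The idea of estimating the Hessian's principal eigenvalue without forming the Hessian can be illustrated with a small sketch. It is not the authors' online algorithm: it is a batch power iteration in which each Hessian-vector product is approximated by a finite difference of two gradients. The toy quadratic objective, the function names, the perturbation size `alpha`, and the choice of step size as the inverse of the estimated eigenvalue are all assumptions made for illustration.

```python
import math

# Hypothetical test objective: E(w) = 0.5 * w^T A w with a known symmetric A.
A = [[4.0, 1.0],
     [1.0, 3.0]]

def grad(w):
    """Gradient of the toy quadratic, A w (stands in for backpropagation)."""
    return [sum(A[i][j] * w[j] for j in range(len(w))) for i in range(len(w))]

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def principal_eigenvalue(grad_fn, w, n_iter=100, alpha=1e-3):
    """Power iteration with finite-difference Hessian-vector products.

    H u is approximated by (grad(w + alpha * u) - grad(w)) / alpha,
    so the Hessian matrix itself is never built.
    """
    v = [1.0] * len(w)           # arbitrary non-zero starting vector
    g0 = grad_fn(w)              # gradient at the current weights
    lam = 0.0
    for _ in range(n_iter):
        n = norm(v)
        u = [x / n for x in v]   # unit-length direction
        g1 = grad_fn([wi + alpha * ui for wi, ui in zip(w, u)])
        v = [(a - b) / alpha for a, b in zip(g1, g0)]  # approximates H u
        lam = norm(v)            # tends to the largest |eigenvalue|
    return lam

lam_max = principal_eigenvalue(grad, [0.5, -0.2])
step_size = 1.0 / lam_max        # assumed rule: inverse of the estimate
```

For this toy matrix the estimate can be checked against the closed-form largest eigenvalue, (7 + sqrt(5)) / 2. In a real network, `grad` would be replaced by a backpropagation pass, and an online variant would average the finite-difference estimates over successive training patterns.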

artificial intelligence, eigenvalue, machine learning, (14 more...)

Country:

North America > United States > Oregon > Washington County > Beaverton (0.24)
North America > United States > Massachusetts > Hampshire County (0.14)

Industry: Education > Educational Setting > Online (0.54)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.58)

Chang, Eric I., Lippmann, Richard P.

A Boundary Hunting Radial Basis Function Classifier which Allocates Centers Constructively

A new boundary hunting radial basis function (BH-RBF) classifier which allocates RBF centers constructively near class boundaries is described. This classifier creates complex decision boundaries only in regions where confusions occur and corresponding RBF outputs are similar. A predicted square error measure is used to determine how many centers to add and when to stop adding centers. Two experiments are presented which demonstrate the advantages of the BH-RBF classifier. One uses artificial data with two classes and two input features, where each class contains four clusters but only one cluster is near a decision region boundary.

artificial intelligence, classifier, machine learning, (12 more...)

Country: North America > United States (0.94)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

Bonnlander, Brian V., Mozer, Michael C.

Metamorphosis Networks: An Alternative to Constructive Models

Given a set of training examples, determining the appropriate number of free parameters is a challenging problem. Constructive learning algorithms attempt to solve this problem automatically by adding hidden units, and therefore free parameters, during learning. We explore an alternative class of algorithms, called metamorphosis algorithms, in which the number of units is fixed, but the number of free parameters gradually increases during learning. The architecture we investigate is composed of RBF units on a lattice, which imposes flexible constraints on the parameters of the network. Virtues of this approach include variable subset selection, robust parameter selection, multiresolution processing, and interpolation of sparse training data.

algorithm, artificial intelligence, machine learning, (18 more...)

Country:

North America > Canada > Ontario > Toronto (0.15)
North America > United States > Colorado > Boulder County > Boulder (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.95)

Simard, Patrice, LeCun, Yann, Denker, John S.

Efficient Pattern Recognition Using a New Transformation Distance

Memory-based classification algorithms such as radial basis functions or K-nearest neighbors typically rely on simple distances (Euclidean, dot product, ...), which are not particularly meaningful on pattern vectors. More complex, better suited distance measures are often expensive and rather ad hoc (elastic matching, deformable templates). We propose a new distance measure which (a) can be made locally invariant to any set of transformations of the input and (b) can be computed efficiently. We tested the method on large handwritten character databases provided by the Post Office and the NIST. Using invariances with respect to translation, rotation, scaling, shearing and line thickness, the method consistently outperformed all other systems tested on the same databases.
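To make the notion of a locally invariant distance concrete, here is a minimal sketch. It is a simplification, not the method as published: it handles a single transformation (a horizontal shift of a 1-D pattern) and is one-sided, measuring the distance from an input to the line through a prototype along its tangent vector. The function names and the toy patterns are hypothetical.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def shift_tangent(p):
    """Central-difference derivative of a 1-D pattern with respect to a
    small rightward shift (zero padding at both ends)."""
    padded = [0.0] + list(p) + [0.0]
    return [(padded[i] - padded[i + 2]) / 2.0 for i in range(len(p))]

def one_sided_tangent_distance(p, x, t):
    """Distance from x to the line {p + a * t}, minimized over the scalar a."""
    tt = sum(ti * ti for ti in t)
    if tt == 0.0:                       # degenerate tangent: plain Euclidean
        return euclidean(p, x)
    diff = [xi - pi for xi, pi in zip(x, p)]
    a = sum(d * ti for d, ti in zip(diff, t)) / tt   # least-squares optimum
    closest = [pi + a * ti for pi, ti in zip(p, t)]
    return euclidean(closest, x)

prototype = [0.0, 1.0, 3.0, 1.0, 0.0, 0.0]
# The prototype moved half a pixel to the right (linear interpolation).
shifted = [0.0, 0.5, 2.0, 2.0, 0.5, 0.0]

d_euclid = euclidean(prototype, shifted)
d_tangent = one_sided_tangent_distance(prototype, shifted,
                                       shift_tangent(prototype))
```

Because `a = 0` is always a candidate, the one-sided tangent distance can never exceed the Euclidean distance; for a small shift such as this one it is noticeably smaller. Handling several transformations at once would replace the scalar projection with a small least-squares problem over all tangent vectors.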

machine learning, pattern recognition, tangent distance, (16 more...)

Country: North America > United States (0.94)

Industry:

Government > Post Office (0.66)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.69)

Holographic Recurrent Networks

Plate, Tony A.

Holographic Recurrent Networks (HRNs) are recurrent networks which incorporate associative memory techniques for storing sequential structure. HRNs can be easily and quickly trained using gradient descent techniques to generate sequences of discrete outputs and trajectories through continuous space. The performance of HRNs is found to be superior to that of ordinary recurrent networks on these sequence generation tasks. 1 INTRODUCTION The representation and processing of data with complex structure in neural networks remains a challenge. In a previous paper [Plate, 1991b] I described Holographic Reduced Representations (HRRs), which use circular-convolution associative memory to embody sequential and recursive structure in fixed-width distributed representations. This paper introduces Holographic Recurrent Networks (HRNs), which are recurrent nets that incorporate these techniques for generating sequences of symbols or trajectories through continuous space.
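The circular-convolution associative memory mentioned above can be sketched in a few lines. The sketch covers only the binding and approximate decoding operations, not the recurrent network or its training. The vector width, the random seed, and the names `cue`, `item`, and `unrelated` are assumptions; elements are drawn from a zero-mean Gaussian with variance 1/n, a common choice for such distributed representations.

```python
import math
import random

def cconv(a, b):
    """Circular convolution: binds two vectors into one of the same width."""
    n = len(a)
    return [sum(a[k] * b[(j - k) % n] for k in range(n)) for j in range(n)]

def ccorr(a, c):
    """Circular correlation: approximate inverse of cconv, used to decode
    b from c = cconv(a, b) given the cue a."""
    n = len(a)
    return [sum(a[k] * c[(j + k) % n] for k in range(n)) for j in range(n)]

def cosine(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (math.sqrt(sum(x * x for x in u)) *
                  math.sqrt(sum(y * y for y in v)))

def random_vector(n, rng):
    # Elements ~ N(0, 1/n), so the expected squared length is 1.
    return [rng.gauss(0.0, 1.0 / math.sqrt(n)) for _ in range(n)]

rng = random.Random(0)
n = 512
cue, item, unrelated = (random_vector(n, rng) for _ in range(3))

trace = cconv(cue, item)          # store the association cue -> item
decoded = ccorr(cue, trace)       # noisy reconstruction of item

sim_item = cosine(decoded, item)        # expected to be clearly positive
sim_other = cosine(decoded, unrelated)  # expected to be near zero
```

The decoded vector is only a noisy copy of the stored item, so in practice it is compared against a set of known vectors and the most similar one is taken as the answer. The direct O(n^2) loops are kept for readability; an FFT-based version would be the usual choice for large n.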

artificial intelligence, machine learning, sequence, (17 more...)

Country:

North America > United States (0.46)
North America > Canada > Ontario > Toronto (0.15)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.50)

Best-First Model Merging for Dynamic Learning and Recognition

Omohundro, Stephen M.

"Best-first model merging" is a general technique for dynamically choosing the structure of a neural or related architecture while avoiding overfitting. It is applicable to both learning and recognition tasks and often generalizes significantly better than fixed structures. We demonstrate the approach applied to the tasks of choosing radial basis functions for function learning, choosing local affine models for curve and constraint surface modelling, and choosing the structure of a balltree or bumptree to maximize efficiency of access.

artificial intelligence, best-first model merging, machine learning, (17 more...)

Country:

North America > United States > New York (0.05)
North America > United States > California > San Mateo County > San Mateo (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.47)
(2 more...)

Wettschereck, Dietrich, Dietterich, Thomas

Improving the Performance of Radial Basis Function Networks by Learning Center Locations

Three methods for improving the performance of (Gaussian) radial basis function (RBF) networks were tested on the NETtalk task. In RBF, a new example is classified by computing its Euclidean distance to a set of centers chosen by unsupervised methods. The application of supervised learning to learn a non-Euclidean distance metric was found to reduce the error rate of RBF networks, while supervised learning of each center's variance resulted in inferior performance. The best improvement in accuracy was achieved by networks called generalized radial basis function (GRBF) networks. In GRBF, the center locations are determined by supervised learning. After training on 1000 words, RBF classifies 56.5% of letters correctly, while GRBF classifies 73.4% of letters correctly (on a separate test set). From these and other experiments, we conclude that supervised learning of center locations can be very important for radial basis function learning.
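The difference between fixed and learned centers can be sketched with a tiny single-output network. This is not the experimental setup of the paper: the Gaussian width, the weights, the learning rate, and the use of plain gradient descent on a squared error are assumptions chosen to show one supervised update of the center locations.

```python
import math

def rbf_activations(x, centers, sigma):
    """Gaussian response of each unit to the Euclidean distance from x."""
    acts = []
    for c in centers:
        sq_dist = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
        acts.append(math.exp(-sq_dist / (2.0 * sigma ** 2)))
    return acts

def rbf_output(x, centers, weights, sigma):
    """Network output: weighted sum of the Gaussian activations."""
    acts = rbf_activations(x, centers, sigma)
    return sum(w * a for w, a in zip(weights, acts))

def update_centers(x, target, centers, weights, sigma, lr):
    """One gradient step on E = 0.5 * (y - target)^2 with respect to the
    centers; dE/dc_k = (y - target) * w_k * phi_k * (x - c_k) / sigma^2."""
    acts = rbf_activations(x, centers, sigma)
    err = sum(w * a for w, a in zip(weights, acts)) - target
    new_centers = []
    for c, w, a in zip(centers, weights, acts):
        scale = err * w * a / sigma ** 2
        new_centers.append([ci - lr * scale * (xi - ci)
                            for xi, ci in zip(x, c)])
    return new_centers

# Hypothetical toy values.
x, target = [1.0, 0.0], 1.0
centers = [[0.0, 0.0], [2.0, 1.0]]
weights = [1.0, -0.5]
sigma = 1.0

err_before = abs(rbf_output(x, centers, weights, sigma) - target)
centers = update_centers(x, target, centers, weights, sigma, lr=0.1)
err_after = abs(rbf_output(x, centers, weights, sigma) - target)
```

In this example the step moves the positively weighted center toward the input and the negatively weighted center away from it, so the output moves closer to the target. With centers fixed by an unsupervised method, only `weights` would be adjusted.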

artificial intelligence, machine learning, supervised learning, (14 more...)

Country:

North America > United States > Oregon > Benton County > Corvallis (0.05)
North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > Oregon > Washington County > Beaverton (0.04)
(3 more...)

Genre: Research Report > New Finding (0.48)

Industry: Education > Educational Setting (0.41)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)

Structural Risk Minimization for Character Recognition

Guyon, I., Vapnik, V., Boser, B., Bottou, L., Solla, S. A.

The method of Structural Risk Minimization refers to tuning the capacity of the classifier to the available amount of training data. This capacity is influenced by several factors, including: (1) properties of the input space, (2) nature and structure of the classifier, and (3) learning algorithm. Actions based on these three factors are combined here to control the capacity of linear classifiers and improve generalization on the problem of handwritten digit recognition.

classifier, linear classifier, structural risk minimization, (13 more...)

Country: North America > United States (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
