AITopics

An adaptive online algorithm is proposed to estimate hierarchical data structures for non-stationary data sources. The approach is based on the principle of minimum cross entropy to derive a decision tree for data clustering and it employs a metalearning idea (learning to learn) to adapt to changes in data characteristics. Its efficiency is demonstrated by grouping non-stationary artificial data and by hierarchical segmentation of LANDSAT images. 1 Introduction Unsupervised learning addresses the problem of detecting structure inherent in unlabeled and unclassified data. The encoding is usually represented by an assignment matrix $M = (M_{i\alpha})$, where $M_{i\alpha} = 1$ if and only if $x_i$ belongs to cluster $\alpha$. The cost function $\sum_{i} \sum_{\alpha} M_{i\alpha} V(x_i, y_\alpha)$ measures the quality of a data partition, i.e., optimal assignments and prototypes $(M, y)^{\mathrm{opt}} = \arg\min_{M, y} \sum_{i} \sum_{\alpha} M_{i\alpha} V(x_i, y_\alpha)$ minimize the inhomogeneity of clusters w.r.t. a given distance measure $V$. For reasons of simplicity we restrict the presentation to the sum-of-squared-error criterion $V(x, y)$. To facilitate this minimization a deterministic annealing approach was proposed in [5], which maps the discrete optimization problem, i.e. how to determine the data assignments, via the Maximum Entropy Principle [2] to a continuous parameter estimation problem.
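The clustering cost in the abstract above can be illustrated with a minimal sketch. It assumes the squared Euclidean distance for V and uses hypothetical names (`squared_error`, `partition_cost`, `best_assignments`) and toy data that do not come from the paper. It evaluates the assignment-weighted sum of distances for a hard assignment matrix and picks, for fixed prototypes, the nearest-prototype assignments. It does not implement deterministic annealing or the online decision tree.

```python
def squared_error(x, y):
    """Assumed distance V(x, y): squared Euclidean distance."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def partition_cost(M, X, Y):
    """Sum over data points i and clusters a of M[i][a] * V(X[i], Y[a])."""
    return sum(M[i][a] * squared_error(X[i], Y[a])
               for i in range(len(X)) for a in range(len(Y)))


def best_assignments(X, Y):
    """For fixed prototypes Y, assign each point to its nearest prototype."""
    M = []
    for x in X:
        nearest = min(range(len(Y)), key=lambda a: squared_error(x, Y[a]))
        M.append([1 if a == nearest else 0 for a in range(len(Y))])
    return M


# Toy data (illustrative only): three points, two prototypes.
X = [(0.0, 0.0), (0.0, 1.0), (4.0, 4.0)]
Y = [(0.0, 0.5), (4.0, 4.0)]
M = best_assignments(X, Y)
cost = partition_cost(M, X, Y)
print(M, cost)
```

Each row of `M` contains exactly one 1, which encodes that every data point belongs to exactly one cluster.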

data source, learning, prototype

Country:

Europe > Germany > North Rhine-Westphalia > Cologne Region > Bonn (0.04)
Asia > Middle East > Jordan (0.04)
Asia > Middle East > Israel (0.04)

Genre: Instructional Material > Online (0.41)

Industry: Education > Educational Setting > Online (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.35)

Sollich, Peter, Barber, David

On-line Learning from Finite Training Sets in Nonlinear Networks

Online learning is one of the most common forms of neural network training. We present an analysis of online learning from finite training sets for nonlinear networks (namely, soft-committee machines), advancing the theory to more realistic learning scenarios. Dynamical equations are derived for an appropriate set of order parameters; these are exact in the limiting case of either linear networks or infinite training sets. Preliminary comparisons with simulations suggest that the theory captures some effects of finite training sets, but may not yet account correctly for the presence of local minima.

equation, infinite training, order parameter

Country: Europe > United Kingdom (0.04)

Genre: Instructional Material > Online (0.50)

Industry: Education > Educational Setting > Online (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.94)

Rattray, Magnus, Saad, David

Globally Optimal On-line Learning Rules

We present a method for determining the globally optimal online learning rule for a soft committee machine under a statistical mechanics framework. This work complements previous results on locally optimal rules, where only the rate of change in generalization error was considered. We maximize the total reduction in generalization error over the whole learning process and show how the resulting rule can significantly outperform the locally optimal rule. 1 Introduction We consider a learning scenario in which a feed-forward neural network model (the student) emulates an unknown mapping (the teacher), given a set of training examples produced by the teacher. The performance of the student network is typically measured by its generalization error, which is the expected error on an unseen example. The aim of training is to reduce the generalization error by adapting the student network's parameters appropriately. A common form of training is online learning, where training patterns are presented sequentially and independently to the network at each learning step.
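The student-teacher scenario described above can be sketched as follows. This is a toy illustration under stated assumptions, not the globally optimal rule from the paper: it uses plain online gradient descent, assumes the activation erf(h / sqrt(2)) for the hidden units of the soft committee machine, and estimates the generalization error by Monte Carlo sampling. All names (`online_step`, `generalization_error`) and parameter values are hypothetical.

```python
import math
import random


def g(h):
    """Assumed hidden-unit activation."""
    return math.erf(h / math.sqrt(2.0))


def g_prime(h):
    """Derivative of g with respect to h."""
    return math.sqrt(2.0 / math.pi) * math.exp(-0.5 * h * h)


def fields(W, x):
    """Inner products of each hidden unit's weight vector with the input."""
    return [sum(wi * xi for wi, xi in zip(w, x)) for w in W]


def output(W, x):
    """Soft committee machine: unweighted sum of hidden-unit activations."""
    return sum(g(h) for h in fields(W, x))


def online_step(W, x, target, eta):
    """One gradient step on the squared error of a single example."""
    n = len(x)
    hs = fields(W, x)
    delta = target - sum(g(h) for h in hs)
    for w, h in zip(W, hs):
        scale = eta / n * delta * g_prime(h)
        for i in range(n):
            w[i] += scale * x[i]


def generalization_error(W, B, rng, samples=2000):
    """Monte Carlo estimate of the expected error on unseen Gaussian inputs."""
    n = len(B[0])
    total = 0.0
    for _ in range(samples):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        total += 0.5 * (output(W, x) - output(B, x)) ** 2
    return total / samples


rng = random.Random(0)
N, K = 10, 2  # input dimension and number of hidden units (arbitrary choices)
B = [[rng.gauss(0.0, 1.0) / math.sqrt(N) for _ in range(N)] for _ in range(K)]  # teacher
W = [[rng.gauss(0.0, 0.01) for _ in range(N)] for _ in range(K)]               # student

err_before = generalization_error(W, B, rng)
for _ in range(3000):
    # Each example is drawn fresh and used once, as in online learning.
    x = [rng.gauss(0.0, 1.0) for _ in range(N)]
    online_step(W, x, output(B, x), eta=0.5)
err_after = generalization_error(W, B, rng)
print(err_before, err_after)
```

Each training pattern is discarded after one update, which is why this form of training needs no storage of past examples.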

algorithm, generalization error, optimal rule

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)
Asia > Middle East > Jordan (0.05)
North America > United States > California > San Mateo County > San Mateo (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Instructional Material > Online (0.40)

Industry: Education > Educational Setting > Online (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Leen, Todd K., Schottky, Bernhard, Saad, David

Two Approaches to Optimal Annealing

We employ both master equation and order parameter approaches to analyze the asymptotic dynamics of online learning with different learning rate annealing schedules. We examine the relations between the results obtained by the two approaches and obtain new results on the optimal decay coefficients and their dependence on the number of hidden nodes in a two layer architecture.
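A learning rate annealing schedule of the kind mentioned above can be illustrated on a much simpler problem than the two-layer networks analyzed in the paper. The sketch below assumes a one-dimensional task (tracking the mean of noisy samples) and compares a 1/t decay with a constant rate. The function `online_mean` and the schedules are hypothetical choices for illustration only.

```python
import random


def online_mean(samples, schedule, w0=0.0):
    """Online update w <- w + eta_t * (x_t - w), with eta_t given by schedule(t)."""
    w = w0
    for t, x in enumerate(samples, start=1):
        w += schedule(t) * (x - w)
    return w


rng = random.Random(1)
samples = [3.0 + rng.gauss(0.0, 1.0) for _ in range(5000)]

annealed = online_mean(samples, lambda t: 1.0 / t)  # decaying learning rate
constant = online_mean(samples, lambda t: 0.1)      # fixed learning rate
sample_mean = sum(samples) / len(samples)
print(annealed, constant, sample_mean)
```

With the 1/t schedule the online estimate coincides with the running sample mean, so its fluctuations shrink as more data arrive. A constant rate keeps weighting recent samples heavily and continues to fluctuate around the target.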

equation, generalization error, order parameter approach

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Oregon (0.05)
North America > United States > California > San Mateo County > San Mateo (0.04)

Industry: Education > Educational Setting (0.35)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Milostan, Jeanne C., Cottrell, Garrison W.

Serial Order in Reading Aloud: Connectionist Models and Neighborhood Structure

Each chart is primarily divided into the three major noise conditions, cf.

neighborhood, pronunciation, regression

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
Asia > Middle East > Jordan (0.05)
North America > United States > New York (0.04)

Industry: Education (0.94)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Landauer, Thomas K., Laham, Darrell, Foltz, Peter W.

Learning Human-like Knowledge by Singular Value Decomposition: A Progress Report

Singular value decomposition (SVD) can be viewed as a method for unsupervised training of a network that associates two classes of events reciprocally by linear connections through a single hidden layer. SVD was used to learn and represent relations among very large numbers of words (20k-60k) and very large numbers of natural text passages (1k-70k) in which they occurred. The result was 100-350 dimensional "semantic spaces" in which any trained or newly added word or passage could be represented as a vector, and similarities were measured by the cosine of the contained angle between vectors. Good accuracy in simulating human judgments and behaviors has been demonstrated by performance on multiple-choice vocabulary and domain knowledge tests, emulation of expert essay evaluations, and in several other ways. Examples are also given of how the kind of knowledge extracted by this method can be applied.
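The similarity measure named above, the cosine of the angle between two vectors, can be sketched in a few lines. The three-dimensional vectors below are made-up placeholders, not coordinates from the paper's semantic spaces (which have 100-350 dimensions and come from an SVD that this sketch does not perform). The name `cosine` is a hypothetical helper.

```python
import math


def cosine(u, v):
    """Cosine of the angle between vectors u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical low-dimensional vectors standing in for words or passages.
item_a = [0.9, 0.1, 0.3]
item_b = [0.8, 0.2, 0.35]
item_c = [0.1, 0.9, 0.0]

sim_close = cosine(item_a, item_b)  # nearly parallel vectors
sim_far = cosine(item_a, item_c)    # vectors pointing in different directions
print(sim_close, sim_far)
```

Because the cosine ignores vector length, a long passage and a short one can still be judged similar if they point in the same direction.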

knowledge, landauer, similarity

Country:

North America > Panama (0.14)
North America > United States > New York (0.04)
North America > United States > New Mexico > Doña Ana County > Las Cruces (0.04)

Industry:

Health & Medicine > Therapeutic Area (0.48)
Education > Assessment & Standards > Student Performance (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.94)
Information Technology > Artificial Intelligence > Cognitive Science (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.88)

Schaal, Stefan, Vijayakumar, Sethu, Atkeson, Christopher G.

Local Dimensionality Reduction

Each chart is primarily divided into the three major noise conditions, cf.

dimensionality reduction, noise, regression

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
Asia > Middle East > Jordan (0.05)
North America > United States > New York (0.05)

Industry: Education (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.44)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.30)

Rattray, Magnus, Saad, David

Globally Optimal On-line Learning Rules

We present a method for determining the globally optimal online learning rule for a soft committee machine under a statistical mechanics framework. This work complements previous results on locally optimal rules, where only the rate of change in generalization error was considered. We maximize the total reduction in generalization error over the whole learning process and show how the resulting rule can significantly outperform the locally optimal rule. 1 Introduction We consider a learning scenario in which a feed-forward neural network model (the student) emulates an unknown mapping (the teacher), given a set of training examples produced by the teacher. The performance of the student network is typically measured by its generalization error, which is the expected error on an unseen example. A common form of training is online learning, where training patterns are presented sequentially and independently to the network at each learning step. This form of training can be beneficial in terms of both storage and computation time, especially for large systems.

artificial intelligence, machine learning, optimal rule

Country:

North America > United States (0.14)
Europe > United Kingdom (0.14)

Genre: Instructional Material > Online (0.40)

Industry: Education > Educational Setting > Online (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Lee, Daniel D., Seung, H. S.

A Neural Network Based Head Tracking System

We have constructed an inexpensive video based motorized tracking system that learns to track a head. It uses real time graphical user inputs or an auxiliary infrared detector as supervisory signals to train a convolutional neural network. The inputs to the neural network consist of normalized luminance and chrominance images and motion information from frame differences. Subsampled images are also used to provide scale invariance. During the online training phases the neural network rapidly adjusts the input weights depending upon the reliability of the different channels in the surrounding environment. This quick adaptation allows the system to robustly track a head even when other objects are moving within a cluttered background.

artificial intelligence, head tracking system, machine learning

Industry: Education > Educational Setting > Online (0.53)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Schwenk, Holger, Bengio, Yoshua

Training Methods for Adaptive Boosting of Neural Networks

"Boosting" is a general method for improving the performance of any learning algorithm that consistently generates classifiers which need to perform only slightly better than random guessing. A recently proposed and very promising boosting algorithm is AdaBoost [5]. It has been applied with great success to several benchmark machine learning problems using rather simple learning algorithms [4], and decision trees [1, 2, 6]. In this paper we use AdaBoost to improve the performances of neural networks. We compare training methods based on sampling the training set and weighting the cost function. Our system achieves about 1.4% error on a data base of online handwritten digits from more than 200 writers. Adaptive boosting of a multi-layer network achieved 1.5% error on the UCI Letters and 8.1% error on the UCI satellite data set.

artificial intelligence, classifier, machine learning

Country: North America > Canada (0.15)

Industry: Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.87)
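The AdaBoost entry above (Schwenk and Bengio) contrasts two ways of training the weak learner: resampling the training set according to the example weights, or weighting the cost function directly. A minimal sketch of the second variant follows. It assumes one-dimensional decision stumps as weak learners instead of the neural networks used in the paper, labels in {-1, +1}, and toy data. All function names are hypothetical.

```python
import math


def stump_predict(stump, x):
    """Decision stump: predict `sign` above the threshold, `-sign` below it."""
    theta, sign = stump
    return sign if x > theta else -sign


def train_stump(xs, ys, weights):
    """Pick the threshold/sign pair with the lowest weighted training error."""
    points = sorted(set(xs))
    thresholds = [points[0] - 0.5] + [(a + b) / 2 for a, b in zip(points, points[1:])]
    best, best_err = None, float("inf")
    for theta in thresholds:
        for sign in (1, -1):
            err = sum(w for x, y, w in zip(xs, ys, weights)
                      if stump_predict((theta, sign), x) != y)
            if err < best_err:
                best, best_err = (theta, sign), err
    return best, best_err


def adaboost(xs, ys, rounds):
    """Return a list of (alpha, stump) pairs trained on reweighted examples."""
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        stump, err = train_stump(xs, ys, weights)
        err = min(max(err, 1e-12), 1 - 1e-12)  # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Increase the weight of misclassified examples, decrease the rest.
        weights = [w * math.exp(-alpha * y * stump_predict(stump, x))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def classify(ensemble, x):
    """Weighted vote of all stumps in the ensemble."""
    score = sum(alpha * stump_predict(stump, x) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1


# Toy data that no single stump can separate.
xs = [0, 1, 2, 3, 4, 5]
ys = [1, 1, -1, -1, 1, 1]
one_round = [classify(adaboost(xs, ys, 1), x) for x in xs]
three_rounds = [classify(adaboost(xs, ys, 3), x) for x in xs]
print(one_round, three_rounds)
```

In the resampling variant, `train_stump` would instead be given an unweighted sample drawn from the training set with probabilities proportional to `weights`.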