AITopics

Country:

Asia > Japan > Shikoku > Kōchi Prefecture > Kochi (0.05)
Asia > Japan > Kyūshū & Okinawa > Kyūshū (0.05)
Oceania > New Zealand > South Island > Otago > Dunedin (0.04)
(2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.30)

Kowalczyk, Adam, Ferrá, Herman L.

MLP Can Provably Generalize Much Better than VC-bounds Indicate

It is also shown that bounds following the true learning curve can be derived from a formalism based on the density of error patterns.

perceptron, sequence, thermodynamic limit, (16 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)
Oceania > Australia (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.33)

For Valid Generalization the Size of the Weights is More Important than the Size of the Network

Bartlett, Peter L.

Baum and Haussler [4] used these results to give sample size bounds for multi-layer threshold networks Generalization and the Size of the Weights in Neural Networks 135 that grow at least as quickly as the number of weights (see also [7]). However, for pattern classification applications the VC-bounds seem loose; neural networks often perform successfully with training sets that are considerably smaller than the number of weights. This paper shows that for classification problems on which neural networks perform well, if the weights are not too big, the size of the weights determines the generalization performance. In contrast with the function classes and algorithms considered in the VC-theory, neural networks used for binary classification problems have real-valued outputs, and learning algorithms typically attempt to minimize the squared error of the network output over a training set. As well as encouraging the correct classification, this tends to push the output away from zero and towards the target values of { -1, I}.

dimension, fat-shattering dimension, misclassification probability, (15 more...)

Country:

North America > United States > New York > New York County > New York City (0.05)
Oceania > Australia > Australian Capital Territory > Canberra (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)

For Valid Generalization the Size of the Weights is More Important than the Size of the Network

Bartlett, Peter L.

Baum and Haussler [4] used these results to give sample size bounds for multi-layer threshold networks Generalization and the Size ofthe Weights in Neural Networks 135 that grow at least as quickly as the number of weights (see also [7]). However, for pattern classification applications the VC-bounds seem loose; neural networks often perform successfully with training sets that are considerably smaller than the number of weights. This paper shows that for classification problems on which neural networksperform well, if the weights are not too big, the size of the weights determines the generalization performance. In contrast with the function classes and algorithms considered in the VC-theory, neural networks used for binary classification problems have real-valued outputs, and learning algorithms typically attempt to minimize the squared error of the network output over a training set. As well as encouraging the correct classification, this tends to push the output away from zero and towards the target values of { -1, I}.

dimension, fat-shattering dimension, misclassification probability, (15 more...)

Country:

North America > United States > New York > New York County > New York City (0.05)
Oceania > Australia > Australian Capital Territory > Canberra (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.92)

Interpolating Earth-science Data using RBF Networks and Mixtures of Experts

Wan, Ernest, Bone, Don

We present a mixture of experts (ME) approach to interpolate sparse, spatially correlated earth-science data. Kriging is an interpolation method which uses a global covariation model estimated from the data to take account of the spatial dependence in the data. Based on the close relationship between kriging and the radial basis function (RBF) network (Wan & Bone, 1996), we use a mixture of generalized RBF networks to partition the input space into statistically correlated regions and learn the local covariation model of the data in each region. Applying the ME approach to simulated and real-world data, we show that it is able to achieve good partitioning of the input space, learn the local covariation models and improve generalization.

artificial intelligence, covariation model, machine learning, (15 more...)

Country: Oceania > Australia (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.50)

Peper, Ferdinand, Noda, Hideki

Hebb Learning of Features based on their Information Content

This paper investigates the stationary points of a Hebb learning rule with a sigmoid nonlinearity in it. We show mathematically that when the input has a low information content, as measured by the input's variance, this learning rule suppresses learning, that is, forces the weight vector to converge to the zero vector. When the information content exceeds a certain value, the rule will automatically begin to learn a feature in the input. Our analysis suggests that under certain conditions it is the first principal component that is learned. The weight vector length remains bounded, provided the variance of the input is finite. Simulations confirm the theoretical results derived.

artificial intelligence, machine learning, neuron, (16 more...)

Country:

Oceania (0.47)
Asia > Japan (0.30)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.31)

Nevill-Manning, C. G., Witten, I. H.

Identifying Hierarchical Structure in Sequences: A linear-time algorithm

Journal of Artificial Intelligence ResearchSep-1-1997

SEQUITUR is an algorithm that infers a hierarchical structure from a sequence of discrete symbols by replacing repeated phrases with a grammatical rule that generates the phrase, and continuing this process recursively. The result is a hierarchical representation of the original sequence, which offers insights into its lexical structure. The algorithm is driven by two constraints that reduce the size of the grammar, and produce structure as a by-product. SEQUITUR breaks new ground by operating incrementally. Moreover, the method's simple structure permits a proof that it operates in space and time that is linear in the size of the input. Our implementation can process 50,000 symbols per second and has been applied to an extensive range of real world sequences.

algorithm, grammar, sequence, (16 more...)

doi: 10.1613/jair.374

AI Access Foundation

10192

Country:

Europe > Norway > Eastern Norway > Oslo (0.04)
Oceania > New Zealand > North Island > Waikato > Hamilton (0.04)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.46)

Calendar of Events

AAAI,

AI MagazineMar-15-1997

Spring AI related meetings, conferences, workshops, and symposia.

artificial intelligence, deadline, fax, (16 more...)

AI Magazine

Country:

Oceania > Australia > Victoria > Melbourne (0.14)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
Europe > Portugal > Lisbon > Lisbon (0.14)
(43 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.70)

Wilson, D. R., Martinez, T. R.

Improved Heterogeneous Distance Functions

Journal of Artificial Intelligence ResearchJan-1-1997

Instance-based learning techniques typically handle continuous and linear input values well, but often do not handle nominal input attributes appropriately. The Value Difference Metric (VDM) was designed to find reasonable distance values between nominal attribute values, but it largely ignores continuous attributes, requiring discretization to map continuous values into nominal values. This paper proposes three new heterogeneous distance functions, called the Heterogeneous Value Difference Metric (HVDM), the Interpolated Value Difference Metric (IVDM), and the Windowed Value Difference Metric (WVDM). These new distance functions are designed to handle applications with nominal attributes, continuous attributes, or both. In experiments on 48 applications the new distance metrics achieve higher classification accuracy on average than three previous distance functions on those datasets that have both nominal and continuous attributes.

accuracy, dataset, distance function, (14 more...)

doi: 10.1613/jair.346

AI Access Foundation

10182

Country:

North America > United States > California > Orange County > Irvine (0.14)
Europe > Austria > Vienna (0.14)
North America > United States > California > San Mateo County > San Mateo (0.04)
(9 more...)

Genre: Research Report > New Finding (0.67)

Industry: Health & Medicine > Therapeutic Area (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

Campbell, Peter K., Dale, Michael, Ferrá, Herman L., Kowalczyk, Adam

Experiments with Neural Networks for Real Time Implementation of Control

Neural Information Processing SystemsDec-31-1996

This paper describes a neural network based controller for allocating capacity in a telecommunications network. This system was proposed in order to overcome a "real time" response constraint. Two basic architectures are evaluated: 1) a feedforward network-heuristic and; 2) a feedforward network-recurrent network. These architectures are compared against a linear programming (LP) optimiser as a benchmark. This LP optimiser was also used as a teacher to label the data samples for the feedforward neural network training algorithm. It is found that the systems are able to provide a traffic throughput of 99% and 95%, respectively, of the throughput obtained by the linear programming solution. Once trained, the neural network based solutions are found in a fraction of the time required by the LP optimiser.

link violation, neural network, sdh network, (12 more...)