AITopics

This paper proposes a practical optimization method for layered neural networks, by which the optimal model and parameter can be found simultaneously. We modify the conventional information criterion into a differentiable function of parameters, and then minimize it, while controlling it back to the ordinary form. Effectiveness of this method is discussed theoretically and experimentally.

optimal model, optimization method, prediction error, (10 more...)

Country:

Asia > Japan > Honshū > Kantō > Kanagawa Prefecture > Yokohama (0.05)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Utans, Joachim

Learning in Compositional Hierarchies: Inducing the Structure of Objects from Data

Model-based object recognition solves the problem of invariant recognition by relying on stored prototypes at unit scale positioned at the origin of an object-centered coordinate system. Elastic matching techniques are used to find a correspondence between features of the stored model and the data and can also compute the parameters of the transformation the observed instance has undergone relative to the stored model.

hierarchy, node, substructure, (14 more...)

Country:

North America > United States > California > San Mateo County > San Mateo (0.05)
North America > United States > California > Alameda County > Berkeley (0.05)
Asia > Middle East > Jordan (0.05)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

Leerink, Laurens R., Jabri, Marwan A.

Constructive Learning Using Internal Representation Conflicts

The first class of network adaptation algorithms starts out with a redundant architecture and proceeds by pruning away seemingly unimportant weights (Sietsma and Dow, 1988; Le Cun et al., 1990). A second class of algorithms starts off with a sparse architecture and grows the network to the complexity required by the problem. Several algorithms have been proposed for growing feedforward networks. The upstart algorithm of Frean (1990) and the cascade-correlation algorithm of Fahlman (1990) are examples of this approach.

algorithm, context unit, weight update value, (13 more...)

Country:

North America > United States > California > San Mateo County > San Mateo (0.06)
Asia > Middle East > Jordan (0.05)
Oceania > Australia > New South Wales > Sydney (0.04)
(3 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Bostock, Richard T. J., Harget, Alan J.

A Comparative Study of a Modified Bumptree Neural Network with Radial Basis Function Networks and the Standard Multi Layer Perceptron

Bumptrees are geometric data structures introduced by Omohundro (1991) to provide efficient access to a collection of functions on a Euclidean space of interest. We describe a modified bumptree structure that has been employed as a neural network classifier, and compare its performance on several classification tasks against that of radial basis function networks and the standard multi-layer perceptron.

1 INTRODUCTION

A number of neural network studies have demonstrated the utility of the multi-layer perceptron (MLP) and shown it to be a highly effective paradigm. Studies have also shown, however, that the MLP is not without its problems: in particular, it requires an extensive training time, is susceptible to local minima problems, and its performance is dependent upon its internal network architecture. In an attempt to improve upon the generalisation performance and computational efficiency, a number of studies have been undertaken, principally concerned with investigating the parametrisation of the MLP. It is well known, for example, that the generalisation performance of the MLP is affected by the number of hidden units in the network, which have to be determined empirically since theory provides no guidance.

bumptree, problem space, rbf network, (12 more...)

Country:

Europe > United Kingdom > England (0.05)
Europe > Netherlands > North Brabant > Eindhoven (0.05)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (1.00)

Ginzburg, Iris, Horn, David

Combined Neural Networks for Time Series Analysis

We propose a method for improving the performance of any network designed to predict the next value of a time series. We advocate analyzing the deviations of the network's predictions from the data in the training set. This can be carried out by a secondary network trained on the time series of these residuals. The combined system of the two networks is viewed as the new predictor. We demonstrate the simplicity and success of this method by applying it to the sunspots data. The small corrections of the secondary network can be regarded as resulting from a Taylor expansion of a complex network which includes the combined system.
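The two-stage idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: both "networks" are replaced by very simple stand-ins (a persistence predictor as the primary model and a one-coefficient least-squares model on the residuals as the secondary model), and the series is synthetic rather than the sunspots data. The names `fit_ar1`, `mse`, and `series` are assumptions made for illustration.

```python
import math


def fit_ar1(values):
    """Least-squares slope a for values[t] ~ a * values[t - 1] (no intercept)."""
    num = sum(values[t] * values[t - 1] for t in range(1, len(values)))
    den = sum(v * v for v in values[:-1])
    return num / den if den else 0.0


def mse(errors):
    """Mean squared error of a list of prediction errors."""
    return sum(e * e for e in errors) / len(errors)


# Synthetic series (assumption: stands in for real data such as sunspots).
series = [math.sin(0.3 * t) + 0.5 * math.sin(0.05 * t) for t in range(200)]

# Primary predictor (stand-in): predict the next value as the current value.
primary_pred = series[:-1]
residuals = [series[t + 1] - primary_pred[t] for t in range(len(primary_pred))]

# Secondary predictor (stand-in): model each residual from the previous one.
a = fit_ar1(residuals)

# Combined predictor = primary prediction + predicted residual.
combined_err = [residuals[t] - a * residuals[t - 1] for t in range(1, len(residuals))]

primary_mse = mse(residuals[1:])
combined_mse = mse(combined_err)
print(f"primary MSE:  {primary_mse:.6f}")
print(f"combined MSE: {combined_mse:.6f}")
```

Because the secondary coefficient is fitted by least squares on the same residuals, the combined training error in this sketch cannot exceed the primary training error; whether the correction helps on held-out data is a separate question that the sketch does not address.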

prediction, primary network, secondary network, (14 more...)

Country: Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Time Series Analysis (0.42)

Ron, Dana, Singer, Yoram, Tishby, Naftali

The Power of Amnesia

We propose a learning algorithm for a variable memory length Markov process. Human communication, whether given as text, handwriting, or speech, has multiple characteristic time scales. On short scales it is characterized mostly by the dynamics that generate the process, whereas on large scales, more syntactic and semantic information is carried. For that reason the conventionally used fixed memory Markov models cannot effectively capture the complexity of such structures. On the other hand, using long memory models uniformly is not practical even for memories as short as four.
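To make the contrast with fixed-memory models concrete, here is a small sketch of a predictor whose effective memory length depends on the context: it uses the longest recent context that was seen in training and falls back to shorter ones otherwise. This is only an illustration of variable-length contexts, not the learning algorithm proposed in the paper; the functions `train` and `predict` and the toy string are hypothetical.

```python
from collections import Counter, defaultdict


def train(text, max_order):
    """Count next-symbol frequencies for every context of length 0..max_order."""
    counts = defaultdict(Counter)
    for i, symbol in enumerate(text):
        for k in range(max_order + 1):
            if i - k < 0:
                break
            counts[text[i - k:i]][symbol] += 1
    return counts


def predict(counts, history, max_order):
    """Predict the next symbol from the longest suffix of history seen in training."""
    for k in range(min(max_order, len(history)), -1, -1):
        context = history[len(history) - k:]
        if context in counts:
            return counts[context].most_common(1)[0][0]
    return None


counts = train("abracadabra", max_order=2)
print(predict(counts, "br", 2))  # uses the length-2 context "br" -> a
print(predict(counts, "ca", 2))  # uses the length-2 context "ca" -> d
print(predict(counts, "zz", 2))  # unseen context, falls back to the empty context -> a
```

A full variable memory length model would also decide which long contexts are worth keeping; this sketch stores every context up to `max_order` and only varies the length at prediction time.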

algorithm, automaton, probability, (14 more...)

Country:

North America > United States > New York (0.04)
Asia > Middle East > Israel > Jerusalem District > Jerusalem (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Tresp, Volker, Ahmad, Subutai, Neuneier, Ralph

Training Neural Networks with Deficient Data

We analyze how data with uncertain or missing input features can be incorporated into the training of a neural network. The general solution requires a weighted integration over the unknown or uncertain input, although computationally cheaper closed-form solutions can be found for certain Gaussian Basis Function (GBF) networks. We also discuss cases in which heuristic solutions such as substituting the mean of an unknown input can be harmful.
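A toy calculation shows why mean substitution can mislead when the model is nonlinear in the missing input. The sketch below compares a weighted average of the model output over the possible input values with the output at the mean input. The function `f` (a simple square), the two candidate values, and their weights are assumptions chosen for illustration; they are not taken from the paper.

```python
def integrated_output(f, values, weights):
    """Weighted average of f over the possible values of the unknown input."""
    total = sum(weights)
    return sum(w * f(v) for v, w in zip(values, weights)) / total


def mean_substituted_output(f, values, weights):
    """Output of f when the unknown input is replaced by its weighted mean."""
    total = sum(weights)
    mean = sum(w * v for v, w in zip(values, weights)) / total
    return f(mean)


def f(x):
    """Hypothetical nonlinear model: output is the square of the input."""
    return x * x


# The unknown input is -1 or +1 with equal weight, so its mean is 0.
values = [-1.0, 1.0]
weights = [0.5, 0.5]

integrated = integrated_output(f, values, weights)
substituted = mean_substituted_output(f, values, weights)
print(integrated)   # 1.0
print(substituted)  # 0.0
```

Here the mean input is a value the data never takes, so substituting it gives an output that matches neither possible case, while the weighted average reflects both.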

incomplete pattern, training neural network, tresp, (14 more...)

Country:

North America > United States > California > San Mateo County > San Mateo (0.05)
Europe > Germany > North Rhine-Westphalia > Upper Bavaria > Munich (0.05)
Asia > Middle East > Jordan (0.05)
(2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Ghahramani, Zoubin, Jordan, Michael I.

Supervised learning from incomplete data via an EM approach

Real-world learning tasks may involve high-dimensional data sets with arbitrary patterns of missing data. In this paper we present a framework based on maximum likelihood density estimation for learning from such data sets. We use mixture models for the density estimates and make two distinct appeals to the Expectation Maximization (EM) principle (Dempster et al., 1977) in deriving a learning algorithm: EM is used both for the estimation of mixture components and for coping with missing data. The resulting algorithm is applicable to a wide range of supervised as well as unsupervised learning problems.

algorithm, gaussian, incomplete data, (13 more...)

Country:

Asia > Middle East > Jordan (0.16)
North America > United States > New York (0.05)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.97)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.90)

Saunders, Gregory M., Angeline, Peter J., Pollack, Jordan B.

Structural and Behavioral Evolution of Recurrent Networks

This paper introduces GNARL, an evolutionary program which induces recurrent neural networks that are structurally unconstrained. In contrast to constructive and destructive algorithms, GNARL employs a population of networks and uses a fitness function's unsupervised feedback to guide search through network space. Annealing is used in generating both gaussian weight changes and structural modifications. Applying GNARL to a complex search and collection task demonstrates that the system is capable of inducing networks with complex internal dynamics.

morgan kaufmann, neural network, structural and behavioral evolution, (13 more...)

Country:

North America > United States > Ohio > Franklin County > Columbus (0.05)
North America > United States > California > San Diego County > San Diego (0.05)
North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)

Bengio, Yoshua, Frasconi, Paolo

Credit Assignment through Time: Alternatives to Backpropagation

Learning to recognize or predict sequences using long-term context has many applications. However, practical and theoretical problems are found in training recurrent neural networks to perform tasks in which input/output dependencies span long intervals. Starting from a mathematical analysis of the problem, we consider and compare alternative algorithms and architectures on tasks for which the span of the input/output dependencies can be controlled. Results on the new algorithms show performance qualitatively superior to that obtained with backpropagation.

1 Introduction

Recurrent neural networks have been considered to learn to map input sequences to output sequences. Machines that could efficiently learn such tasks would be useful for many applications involving sequence prediction, recognition or production. However, practical difficulties have been reported in training recurrent neural networks to perform tasks in which the temporal contingencies present in the input/output sequences span long intervals. In fact, we can prove that dynamical systems such as recurrent neural networks will be increasingly difficult to train with gradient descent as the duration of the dependencies to be captured increases. A mathematical analysis of the problem shows that either one of two conditions arises in such systems.
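The difficulty with long dependencies can be seen in a one-unit example. The sketch below assumes a scalar recurrence x[t+1] = tanh(w * x[t]) and computes the derivative of the final state with respect to the initial state by the chain rule. The recurrence, the weight 0.9, and the starting state 0.5 are illustrative assumptions, not the systems analyzed in the paper.

```python
import math


def gradient_through_time(w, x0, steps):
    """Derivative of the final state with respect to x0 for x <- tanh(w * x)."""
    x = x0
    grad = 1.0
    for _ in range(steps):
        x = math.tanh(w * x)
        # Chain rule factor: d tanh(w * x_prev) / d x_prev = w * (1 - tanh(...)^2)
        grad *= w * (1.0 - x * x)
    return grad


short_span = gradient_through_time(w=0.9, x0=0.5, steps=10)
long_span = gradient_through_time(w=0.9, x0=0.5, steps=50)
print(f"gradient across 10 steps: {short_span:.3e}")
print(f"gradient across 50 steps: {long_span:.3e}")
```

With |w| below 1, every factor in the product has magnitude below |w|, so the gradient shrinks at least geometrically with the number of steps, which is one way a gradient-based learner loses the signal from distant inputs.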

algorithm, information, sequence, (13 more...)

Country:

North America > Canada > Quebec > Montreal (0.05)
Asia > Middle East > Jordan (0.05)
Europe > Italy (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Backpropagation (0.62)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.52)