AITopics

We are concerned with the problem of the number of nodes needed in a feedforward neural network in order to represent a function to within a specified accuracy.

complexity, neural network, representation, (12 more...)

Country:

Oceania > Australia > Australian Capital Territory > Canberra (0.04)
Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.04)
Asia > Middle East > Republic of Türkiye > Ordu Province > Ordu (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Krogh, Anders, Hertz, John A.

Dynamics of Generalization in Linear Perceptrons

We study the evolution of the generalization ability of a simple linear perceptron with $N$ inputs which learns to imitate a "teacher perceptron". The system is trained on $p = \alpha N$ binary example inputs, and the generalization ability is measured by testing for agreement with the teacher on all $2^N$ possible binary input patterns. The dynamics may be solved analytically and exhibit a phase transition from imperfect to perfect generalization at $\alpha = 1$. Except at this point, the generalization ability approaches its asymptotic value exponentially, with critical slowing down near the transition; the relaxation time is $\propto (1 - \sqrt{\alpha})^{-2}$.
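The critical slowing down described in this abstract, a relaxation time proportional to $(1 - \sqrt{\alpha})^{-2}$, can be tabulated directly. The following is a minimal sketch, not the authors' code: the abstract gives only the scaling, so the proportionality constant is assumed to be 1, and `relaxation_time` is a hypothetical helper name.

```python
import math


def relaxation_time(alpha: float) -> float:
    """Relaxation time up to an unknown constant factor: (1 - sqrt(alpha))^-2.

    ASSUMPTION: the prefactor is set to 1; the abstract states only the scaling.
    """
    if alpha < 0:
        raise ValueError("alpha must be non-negative")
    gap = 1.0 - math.sqrt(alpha)
    if gap == 0.0:
        return math.inf  # the relaxation time diverges at the transition alpha = 1
    return gap ** -2


# The relaxation time grows as alpha approaches 1 from either side.
for alpha in (0.25, 0.81, 0.99, 1.21, 4.0):
    print(f"alpha = {alpha:5.2f}   tau (arbitrary units) = {relaxation_time(alpha):10.1f}")
```

Values of `alpha` close to 1 give a much larger relaxation time than values far from it, which is the slowing down near the transition that the abstract refers to.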

generalization, generalization ability, perfect generalization, (12 more...)

Country:

Europe > Denmark > Capital Region > Copenhagen (0.05)
Asia > Singapore (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.84)

Chauvin, Yves

Generalization Dynamics in LMS Trained Linear Networks

Recent progress in network design demonstrates that nonlinear feedforward neural networks can perform impressive pattern classification for a variety of real-world applications (e.g., Le Cun et al., 1990; Waibel et al., 1989). Various simulations and relationships between the neural network and machine learning theoretical literatures also suggest that too large a number of free parameters ("weight overfitting") could substantially reduce generalization performance.

error component, validation dynamic, validation error, (15 more...)

Country:

North America > United States > California > San Mateo County > San Mateo (0.05)
North America > United States > California > Santa Clara County > Stanford (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > Germany > Berlin (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Weigend, Andreas S., Rumelhart, David E., Huberman, Bernardo A.

Generalization by Weight-Elimination with Application to Forecasting

Inspired by the information theoretic idea of minimum description length, we add a term to the back propagation cost function that penalizes network complexity. We give the details of the procedure, called weight-elimination, describe its dynamics, and clarify the meaning of the parameters involved. From a Bayesian perspective, the complexity term can be usefully interpreted as an assumption about prior distribution of the weights. We use this procedure to predict the sunspot time series and the notoriously noisy series of currency exchange rates.

1 INTRODUCTION

Learning procedures for connectionist networks are essentially statistical devices for performing inductive inference. There is a tradeoff between two goals: on the one hand, we want such devices to be as general as possible so that they are able to learn a broad range of problems.
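The abstract above does not spell out the complexity term. As a hedged illustration, the form usually associated with weight-elimination is $\lambda \sum_i \frac{w_i^2 / w_0^2}{1 + w_i^2 / w_0^2}$, added to the usual error term. The sketch below assumes that form; the function names and the parameter names `lam` (penalty strength) and `w0` (weight scale) are illustrative choices, not taken from the listing.

```python
def weight_elimination_penalty(weights, w0=1.0, lam=1.0):
    """Complexity term: lam * sum_i (w_i/w0)^2 / (1 + (w_i/w0)^2).

    ASSUMPTION: this is the commonly cited form; the listing does not state it.
    Each weight contributes between 0 (w_i = 0) and lam (|w_i| >> w0).
    """
    total = 0.0
    for w in weights:
        r = (w / w0) ** 2
        total += r / (1.0 + r)
    return lam * total


def weight_elimination_grad(weights, w0=1.0, lam=1.0):
    """Gradient of the penalty with respect to each weight."""
    return [
        lam * (2.0 * w / w0 ** 2) / (1.0 + (w / w0) ** 2) ** 2
        for w in weights
    ]


# Small weights are penalized roughly quadratically; large weights saturate.
print(weight_elimination_penalty([0.01, 0.1, 1.0, 10.0]))
print(weight_elimination_grad([0.01, 0.1, 1.0, 10.0]))
```

Because each term saturates for large weights, the penalty behaves roughly like a count of the weights that are large relative to `w0`, while the gradient pushes small weights toward zero.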

complexity term, rumelhart, weight-elimination, (16 more...)

Country:

North America > United States > California > Santa Clara County > Stanford (0.05)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > Canada (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Industry: Government (0.36)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Smyth, Padhraic

On Stochastic Complexity and Admissible Models for Neural Network Classifiers

For a detailed rationale the reader is referred to the work of Rissanen (1984) or Wallace and Freeman (1987) and the references therein. Note that the Minimum Description Length (MDL) technique (as Rissanen's approach has become known) is implicitly related to Maximum A Posteriori (MAP) Bayesian estimation techniques if cast in the appropriate framework.

admissible model, classification problem, description length, (13 more...)

Country:

North America > United States > New York (0.04)
North America > United States > California > San Mateo County > San Mateo (0.04)
North America > United States > California > Los Angeles County > Pasadena (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)

Industry: Health & Medicine (0.70)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Keesing, Ron, Stork, David G.

Evolution and Learning in Neural Networks: The Number and Distribution of Learning Trials Affect the Rate of Evolution

Learning can increase the rate of evolution of a population of biological organisms (the Baldwin effect). Our simulations show that in a population of artificial neural networks solving a pattern recognition problem, no learning or too much learning leads to slow evolution of the genes whereas an intermediate amount is optimal. Moreover, for a given total number of training presentations, fastest evolution occurs if different individuals within each generation receive different numbers of presentations, rather than equal numbers. Because genetic algorithms (GAs) help avoid local minima in energy functions, our hybrid learning-GA systems can be applied successfully to complex, high-dimensional pattern recognition problems.

INTRODUCTION

The structure and function of a biological network derives from both its evolutionary precursors and real-time learning.

evolution, fitness, learning, (14 more...)

Country:

North America > United States > Michigan (0.04)
North America > United States > Massachusetts > Middlesex County > Reading (0.04)
North America > United States > California > Santa Clara County > Stanford (0.04)
(3 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Bottou, Léon, Gallinari, Patrick

A Framework for the Cooperation of Learning Algorithms

We introduce a framework for training architectures composed of several modules. This framework, which uses a statistical formulation of learning systems, provides a unique formalism for describing many classical connectionist algorithms as well as complex systems where several algorithms interact. It allows the design of hybrid systems that combine the advantages of connectionist algorithms with those of other learning algorithms.

algorithm, architecture, module, (13 more...)

Country:

North America > United States > California > San Diego County > San Diego (0.04)
Europe > France (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.30)

Girosi, Federico, Poggio, Tomaso, Caprile, Bruno

Extensions of a Theory of Networks for Approximation and Learning: Outliers and Negative Examples

Learning an input-output mapping from a set of examples can be regarded as synthesizing an approximation of a multidimensional function.

extension, girosi, negative example, (13 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)
North America > United States > District of Columbia > Washington (0.04)
Europe > Italy (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (0.44)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.44)

Kadirkamanathan, V., Niranjan, M., Fallside, F.

Sequential Adaptation of Radial Basis Function Neural Networks and its Application to Time-series Prediction

We develop a sequential adaptation algorithm for radial basis function (RBF) neural networks of Gaussian nodes, based on the method of successive F-Projections.

adaptation algorithm, algorithm, rbf network, (12 more...)

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.06)
North America > United States > New Mexico > Los Alamos County > Los Alamos (0.05)
North America > United States > New York (0.04)
North America > United States > California > San Diego County > San Diego (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Hanson, Stephen Jose, Gluck, Mark A.

Spherical Units as Dynamic Consequential Regions: Implications for Attention, Competition and Categorization

Spherical Units can be used to construct dynamic reconfigurable consequential regions, the geometric bases for Shepard's (1987) theory of stimulus generalization in animals and humans. We derive from Shepard's (1987) generalization theory a particular multi-layer network with dynamic (centers and radii) spherical regions which possesses a specific mass function (Cauchy). This learning model generalizes the configural-cue network model (Gluck & Bower 1988): (1) configural cues can be learned and do not require pre-wiring the power-set of cues, (2) consequential regions are continuous rather than discrete, and (3) competition amongst receptive fields is shown to be increased by the global extent of a particular mass function (Cauchy). We compare other common mass functions (Gaussian; used in models of Moody & Darken, 1989; Krushke, 1990) or just standard backpropagation networks with hyperplane/logistic hidden units, showing that neither fares as well as models of human generalization and learning.
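The "global extent" of the Cauchy mass function mentioned in this abstract can be seen by comparing its tail with a Gaussian's. This is only an illustrative sketch: the unnormalized profiles $1 / (1 + (d/r)^2)$ and $\exp(-(d/r)^2)$ are assumed textbook forms, and the function names are hypothetical; the listing does not give the exact parameterization used in the paper.

```python
import math


def cauchy_profile(distance: float, radius: float) -> float:
    """Unnormalized Cauchy-type response: 1 / (1 + (d/r)^2). ASSUMED form."""
    return 1.0 / (1.0 + (distance / radius) ** 2)


def gaussian_profile(distance: float, radius: float) -> float:
    """Unnormalized Gaussian response: exp(-(d/r)^2). ASSUMED form."""
    return math.exp(-((distance / radius) ** 2))


# Both profiles equal 1 at the center, but the Cauchy tail decays only
# polynomially, so a unit still responds far outside its radius.
for d in (0.0, 1.0, 3.0, 5.0):
    c = cauchy_profile(d, radius=1.0)
    g = gaussian_profile(d, radius=1.0)
    print(f"d = {d:3.1f}   cauchy = {c:.6f}   gaussian = {g:.6e}")
```

With a heavy tail, distant units still overlap in their responses, which is one way to read the abstract's claim that the Cauchy function increases competition among receptive fields.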

consequential region, hypothesis distribution, shepard, (16 more...)

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > New Jersey > Mercer County > Princeton (0.05)
North America > United States > New York (0.04)
(2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)