AITopics

Increasing attention has recently been paid to algorithms based on dynamic programming (DP) due to the suitability of DP for learning problems involving control. In stochastic environments where the system being controlled is only incompletely known, however, a unifying theoretical account of these methods has been missing. In this paper we relate DP-based learning algorithms to the powerful techniques of stochastic approximation via a new convergence theorem, enabling us to establish a class of convergent algorithms to which both TD(λ) and Q-learning belong.

1 INTRODUCTION

Learning to predict the future and to find an optimal way of controlling it are the basic goals of learning systems that interact with their environment. A variety of algorithms are currently being studied for the purposes of prediction and control in incompletely specified, stochastic environments. Here we consider learning algorithms defined in Markov environments. There are actions or controls (u) available for the learner that affect both the state transition probabilities and the probability distribution for the immediate, state-dependent costs (c_i(u)) incurred by the learner.
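As a rough illustration of the setting described above, the following is a minimal sketch of tabular Q-learning written as a stochastic approximation iteration for cost minimization. It is not taken from the paper: the two-state environment, the names `TRANSITIONS`, `COSTS`, and `q_learning`, the uniform exploration rule, and the discount factor are all assumptions made for the example. The step size 1/n per state-action pair is one common choice satisfying the usual stochastic approximation conditions (the step sizes sum to infinity while their squares have a finite sum).

```python
import random

# Hypothetical two-state, two-action Markov environment (not from the paper).
# TRANSITIONS[i][u] lists (probability, next_state) pairs for action u in state i.
TRANSITIONS = [
    [[(1.0, 0)], [(0.8, 1), (0.2, 0)]],
    [[(1.0, 1)], [(1.0, 0)]],
]
# COSTS[i][u] plays the role of the immediate cost c_i(u); deterministic here for brevity.
COSTS = [
    [1.0, 0.5],
    [0.0, 2.0],
]


def q_learning(transitions, costs, gamma=0.9, steps=20000, seed=0):
    """Estimate discounted-cost Q-values from sampled transitions only."""
    rng = random.Random(seed)
    n_states = len(transitions)
    n_actions = len(transitions[0])
    q = [[0.0] * n_actions for _ in range(n_states)]
    visits = [[0] * n_actions for _ in range(n_states)]
    state = 0
    for _ in range(steps):
        # Assumed exploration rule: pick actions uniformly at random.
        action = rng.randrange(n_actions)
        probs, nexts = zip(*transitions[state][action])
        next_state = rng.choices(nexts, weights=probs)[0]
        # Step size 1/n for the n-th visit to this (state, action) pair.
        visits[state][action] += 1
        alpha = 1.0 / visits[state][action]
        # Sampled DP target: immediate cost plus discounted minimal future cost.
        target = costs[state][action] + gamma * min(q[next_state])
        q[state][action] += alpha * (target - q[state][action])
        state = next_state
    return q


q = q_learning(TRANSITIONS, COSTS)
# Greedy (cost-minimizing) action in each state under the learned estimates.
greedy = [row.index(min(row)) for row in q]
print(greedy)
```

For this toy environment the exact fixed point can be worked out by hand, for instance Q(0, 1) = 0.5 / (1 - 0.9 × 0.2) ≈ 0.61, so the learned estimates can be compared against it. The learner never reads the transition probabilities directly; it only sees sampled next states, which is the "incompletely known" aspect the abstract refers to.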

algorithm, convergence, theorem 1, (12 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Asia > Middle East > Jordan (0.05)
North America > United States > California > San Diego County > San Diego (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.73)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.73)

Neural Network Exploration Using Optimal Experiment Design

Cohn, David A.

Consider the problem of learning input/output mappings through exploration, e.g.

optimal experiment design, trajectory, variance, (10 more...)

Country:

North America > United States > California > San Francisco County > San Francisco (0.15)
North America > Canada > Ontario > Toronto (0.14)
Asia > Middle East > Jordan (0.06)
(2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Flake, Gary W., Sun, Guo-Zhen, Lee, Yee-Chun

Exploiting Chaos to Control the Future

Recently, Ott, Grebogi and Yorke (OGY) [6] found an effective method to control chaotic systems to unstable fixed points by using only small control forces; however, OGY's method is based on and limited to a linear theory and requires considerable knowledge of the dynamics of the system to be controlled. In this paper we use two radial basis function networks: one as a model of an unknown plant and the other as the controller. The controller is trained with a recurrent learning algorithm to minimize a novel objective function such that the controller can locate an unstable fixed point and drive the system into the fixed point with no a priori knowledge of the system dynamics. Our results indicate that the neural controller offers many advantages over OGY's technique.

algorithm, controller, exploiting chaos, (14 more...)

Country:

North America > United States > Maryland > Prince George's County > College Park (0.14)
Asia > Middle East > Jordan (0.05)
North America > United States > New York (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.89)

Buckland, Kenneth M., Lawrence, Peter D.

Transition Point Dynamic Programming

Transition point dynamic programming (TPDP) is a memory-based, reinforcement learning, direct dynamic programming approach to adaptive optimal control that can reduce the learning time and memory usage required for the control of continuous stochastic dynamic systems. TPDP does so by determining an ideal set of transition points (TPs) which specify only the control action changes necessary for optimal control. TPDP converges to an ideal TP set by using a variation of Q-learning to assess the merits of adding, swapping and removing TPs from states throughout the state space. When applied to a race track problem, TPDP learned the optimal control policy much sooner than conventional Q-learning, and was able to do so using less memory.

1 INTRODUCTION

Dynamic programming (DP) approaches can be utilized to determine optimal control policies for continuous stochastic dynamic systems when the state spaces of those systems have been quantized with a resolution suitable for control (Barto et al., 1991). DP controllers, in their simplest form, are memory-based controllers that operate by repeatedly updating cost values associated with every state in the discretized state space (Barto et al., 1991).

buckland, q-learning, tpdp, (14 more...)

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.05)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Meilijson, Isaac, Ruppin, Eytan

Optimal Signalling in Attractor Neural Networks

It is well known that a given cortical neuron can respond with a different firing pattern for the same synaptic input, depending on its firing history and on the effects of modulator transmitters (see [Connors and Gutnick, 1990] for a review). The time span of different channel conductances is very broad, and the influence of some ionic currents varies with the history of the membrane potential [Lytton, 1991]. Motivated by the history-dependent nature of neuronal firing, we continue our

iteration, neuron, similarity, (14 more...)

Country:

Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.05)
North America > United States > Maryland (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.86)

Ginzburg, Iris, Sompolinsky, Haim

Correlation Functions in a Large Stochastic Neural Network

In many cases the cross-correlations between the activities of cortical neurons are approximately symmetric about zero time delay. These have been taken as an indication of the presence of "functional connectivity" between the correlated neurons (Fetz, Toyama and Smith 1991, Abeles 1991). However, a quantitative comparison between the observed cross-correlations and those expected to exist between neurons that are part of a large assembly of interacting populations has been lacking. Most of the theoretical studies of recurrent neural network models consider only time-averaged firing rates, which are usually given as solutions of mean-field equations. They do not account for the fluctuations about these averages, the study of which requires going beyond the mean-field approximations. In this work we perform a theoretical study of the fluctuations in the neuronal activities and their correlations, in a large stochastic network of excitatory and inhibitory neurons. Depending on the model parameters, this system can exhibit coherent undamped oscillations. Here we focus on parameter regimes where the system is in a statistically stationary state, which is more appropriate for modeling non-oscillatory neuronal activity in cortex. Our results for the magnitudes and the time-dependence of the correlation functions can provide a basis for comparison with physiological data on neuronal correlation functions.

correlation, inhibitory neuron, neuron, (13 more...)

Country:

Asia > Japan > Honshū > Chūbu > Toyama Prefecture > Toyama (0.25)
Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.05)
North America > United States (0.04)
(2 more...)

Genre: Research Report > New Finding (0.35)

Industry: Health & Medicine (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Garzon, Max, Botelho, Fernanda

Observability of Neural Network Behavior

We prove that except possibly for small exceptional sets, discrete-time analog neural nets are globally observable, i.e. all their corrupted pseudo-orbits on computer simulations actually reflect the true dynamical behavior of the network. Locally finite discrete (boolean) neural networks are observable without exception.

activation function, neural network, observability, (13 more...)

Country:

North America > United States > Tennessee > Shelby County > Memphis (0.04)
North America > United States > Rhode Island > Providence County > Providence (0.04)
North America > United States > New York (0.04)
(3 more...)

Industry:

Telecommunications > Networks (0.41)
Information Technology > Networks (0.41)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Coolen, A.C.C., Penney, R. W., Sherrington, D.

Coupled Dynamics of Fast Neurons and Slow Interactions

A simple model of coupled dynamics of fast neurons and slow interactions, modelling self-organization in recurrent neural networks, leads naturally to an effective statistical mechanics characterized by a partition function which is an average over a replicated system. This is reminiscent of the replica trick used to study spin-glasses, but with the difference that the number of replicas has a physical meaning as the ratio of two temperatures and can be varied throughout the whole range of real values. The model has interesting phase consequences as a function of varying this ratio and external stimuli, and can be extended to a range of other models. As the basic archetypal model we consider a system of Ising spin neurons $\sigma_i \in \{-1, 1\}$, $i \in \{1, \ldots, N\}$, interacting via continuous-valued symmetric interactions, $J_{ij}$, which themselves evolve in response to the states of the neurons. ... $J_{ij}\sigma_i\sigma_j$ ... (2) and the subscript $\{J_{ij}\}$ indicates that the $\{J_{ij}\}$ are to be considered as quenched variables.

fast neuron, interaction dynamic, transition, (11 more...)

Country:

North America > United States > New York (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Asia > Singapore (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

Solvable Models of Artificial Neural Networks

Watanabe, Sumio

Solvable models of nonlinear learning machines are proposed, and learning in artificial neural networks is studied based on the theory of ordinary differential equations. A learning algorithm is constructed, by which the optimal parameter can be found without any recursive procedure. The solvable models enable us to analyze the reason why experimental results by the error backpropagation often contradict the statistical learning theory.

differential equation, neural network, solvable model, (10 more...)

Country:

Asia > Japan > Honshū > Kantō > Kanagawa Prefecture > Yokohama (0.05)
North America > United States > New York (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Discontinuous Generalization in Large Committee Machines

Schwarze, H., Hertz, J.

The problem of learning from examples in multilayer networks is studied within the framework of statistical mechanics. Using the replica formalism we calculate the average generalization error of a fully connected committee machine in the limit of a large number of hidden units. If the number of training examples is proportional to the number of inputs in the network, the generalization error as a function of the training set size approaches a finite value. If the number of training examples is proportional to the number of weights in the network we find first-order phase transitions with a discontinuous drop in the generalization error for both binary and continuous weights.

1 INTRODUCTION

Feedforward neural networks are widely used as nonlinear, parametric models for the solution of classification tasks and function approximation. Trained from examples of a given task, they are able to generalize, i.e. to compute the correct output for new, unknown inputs.

committee machine, generalization error, phase transition, (13 more...)

Country:

North America > United States > New York (0.04)
North America > United States > California > San Mateo County > Redwood City (0.04)
Europe > Sweden (0.04)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)