AITopics

We present a new approach to reinforcement learning in which the policies considered by the learning process are constrained by hierarchies of partially specified machines. This allows for the use of prior knowledge to reduce the search space and provides a framework in which knowledge can be transferred across problems and in which component solutions can be recombined to solve larger and more complicated problems. Our approach can be seen as providing a link between reinforcement learning and "behavior-based" or "teleo-reactive" approaches to control. We present provably convergent algorithms for problem-solving and learning with hierarchical machines and demonstrate their effectiveness on a problem with several thousand states.

choice point, optimal policy, reinforcement learning, (13 more...)

Country:

North America > United States > Rhode Island > Providence County > Providence (0.04)
Asia > Middle East > Jordan (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
(9 more...)

Genre: Research Report (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Adaptive Choice of Grid and Time in Reinforcement Learning

Pareigis, Stephan

Consistency problems arise if the discretization needs to be refined, e.g. for more accuracy, application of multi-grid iteration or better starting values for the iteration of the approximate optimal value function. In [7] it was shown, that for diffusion dominated problems, a state to time discretization ratio k/ h of Ch'r, I

discretization, optimal value function, refinement, (16 more...)

Country: Europe > Germany > Schleswig-Holstein > Kiel (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.44)

Munos, Rémi, Bourgine, Paul

Reinforcement Learning for Continuous Stochastic Control Problems

Here we sudy the continuous time, continuous state-space stochastic case, which covers a wide variety of control problems including target, viability, optimization problems (see [FS93], [KP95])}or which a formalism is the following.

algorithm, equation, reinforcement learning, (11 more...)

Country:

Europe > France (0.05)
North America > United States > New York (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Nonparametric Model-Based Reinforcement Learning

Atkeson, Christopher G.

This paper describes some of the interactions of model learning algorithms and planning algorithms we have found in exploring model-based reinforcement learning. The paper focuses on how local trajectory optimizers can be used effectively with learned nonparametric models. We find that trajectory planners that are fully consistent with the learned model often have difficulty finding reasonable plans in the early stages of learning. Trajectory planners that balance obeying the learned model with minimizing cost (or maximizing reward) often do better, even if the plan is not fully consistent with the learned model.

dynamic programming, trajectory, value function, (13 more...)

Country:

North America > United States > California > San Mateo County > San Mateo (0.05)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.98)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.75)

Zimmermann, Hans-Georg, Neuneier, Ralph

The Observer-Observation Dilemma in Neuro-Forecasting

We explain how the training data can be separated into clean information and unexplainable noise. Analogous to the data, the neural network is separated into a time invariant structure used for forecasting, and a noisy part. We propose a unified theory connecting the optimization algorithms for cleaning and learning together with algorithms that control the data noise and the parameter noise. The combined algorithm allows a data-driven local control of the liability of the network parameters and therefore an improvement in generalization. The approach is proven to be very useful at the task of forecasting the German bond market.

algorithm, neural network, noise, (15 more...)

Country: Europe > Germany > North Rhine-Westphalia > Upper Bavaria > Munich (0.05)

Industry: Banking & Finance > Trading (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.71)

Modelling Seasonality and Trends in Daily Rainfall Data

Williams, Peter M.

This paper presents a new approach to the problem of modelling daily rainfall using neural networks. We first model the conditional distributions of rainfall amounts, in such a way that the model itself determines the order of the process, and the time-dependent shape and scale of the conditional distributions. After integrating over particular weather patterns, we are able to extract seasonal variations and long-term trends. 1 Introduction Analysis of rainfall data is important for many agricultural, ecological and engineering activities. Design of irrigation and drainage systems, for instance, needs to take account not only of mean expected rainfall, but also of rainfall volatility. Estimates of crop yields also depend on the distribution of rainfall during the growing season, as well as on the overall amount.

conditional distribution, modelling seasonality and trend, probability, (13 more...)

Country:

Asia > Middle East > Jordan (0.05)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Italy (0.04)

Industry: Food & Agriculture > Agriculture (0.54)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.36)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.30)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.30)

Verrelst, Herman, Moreau, Yves, Vandewalle, Joos, Timmerman, Dirk

Use of a Multi-Layer Perceptron to Predict Malignancy in Ovarian Tumors

ca 125, prediction, roccurve, (15 more...)

Country:

Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.05)
North America > United States > New York (0.05)
Europe > France (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Industry:

Health & Medicine > Therapeutic Area > Oncology (0.47)
Health & Medicine > Therapeutic Area > Obstetrics/Gynecology (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.86)

Tresp, Volker, Briegel, Thomas

A Solution for Missing Data in Recurrent Neural Networks with an Application to Blood Glucose Prediction

We consider neural network models for stochastic nonlinear dynamical systems where measurements of the variable of interest are only available at irregular intervals i.e. most realizations are missing. Difficulties arise since the solutions for prediction and maximum likelihood learning with missing data lead to complex integrals, which even for simple cases cannot be solved analytically. In this paper we propose a specific combination of a nonlinear recurrent neural predictive model and a linear error model which leads to tractable prediction and maximum likelihood adaptation rules. In particular, the recurrent neural network can be trained using the real-time recurrent learning rule and the linear error model can be trained by an EM adaptation rule, implemented using forward-backward Kalman filter equations. The model is applied to predict the glucose/insulin metabolism of a diabetic patient where blood glucose measurements are only available a few times a day at irregular intervals.

error model, linear error model, neural network, (13 more...)

Country: Europe > Germany (0.04)

Genre: Research Report > Promising Solution (0.40)

Industry: Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.64)

Sykacek, Peter, Dorffner, Georg, Rappelsberger, Peter, Zeitlhofer, Josef

Experiences with Bayesian Learning in a Real World Application

This paper reports about an application of Bayes' inferred neural network classifiers in the field of automatic sleep staging. The reason for using Bayesian learning for this task is twofold. First, Bayesian inference is known to embody regularization automatically. Second, a side effect of Bayesian learning leads to larger variance of network outputs in regions without training data. This results in well known moderation effects, which can be used to detect outliers. In a 5 fold cross-validation experiment the full Bayesian solution found with R. Neals hybrid Monte Carlo algorithm, was not better than a single maximum a-posteriori (MAP) solution found with D.J. MacKay's evidence approximation. In a second experiment we studied the properties of both solutions in rejecting classification of movement artefacts.

bayesian inference, classifier, experiment, (14 more...)