
 Bradtke, Steven J.


Reinforcement Learning Methods for Continuous-Time Markov Decision Problems

Neural Information Processing Systems

Semi-Markov Decision Problems are continuous-time generalizations of discrete-time Markov Decision Problems. A number of reinforcement learning algorithms have been developed recently for the solution of Markov Decision Problems, based on the ideas of asynchronous dynamic programming and stochastic approximation. Among these are TD(λ), Q-learning, and Real-time Dynamic Programming. After reviewing semi-Markov Decision Problems and Bellman's optimality equation in that context, we propose algorithms similar to those named above, adapted to the solution of semi-Markov Decision Problems. We demonstrate these algorithms by applying them to the problem of determining the optimal control for a simple queueing system. We conclude with a discussion of circumstances under which these algorithms may be usefully applied.
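The SMDP analogue of Q-learning replaces the fixed per-step discount with one that depends on the random transition (sojourn) time. As a rough, hedged illustration of that idea, not the paper's exact formulation, a single backup might look like the sketch below; the function and variable names are hypothetical.

```python
import math

def smdp_q_update(Q, s, a, reward_rate, tau, s_next, actions,
                  alpha=0.1, beta=0.5):
    """One sketch of an SMDP-style Q-learning backup (illustrative).

    Q           : dict mapping (state, action) pairs to values
    reward_rate : reward accrued per unit time while in state s
    tau         : observed sojourn time before jumping to s_next
    beta        : continuous-time discount rate; the discount over
                  a sojourn of length tau is exp(-beta * tau)
    """
    discount = math.exp(-beta * tau)
    # Reward integrated (and discounted) over the sojourn:
    # integral from 0 to tau of exp(-beta * t) * reward_rate dt
    accrued = (1.0 - discount) / beta * reward_rate
    target = accrued + discount * max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] += alpha * (target - Q[(s, a)])
```

The key difference from discrete-time Q-learning is that both the effective discount and the accumulated reward depend on the observed sojourn time tau rather than on a fixed step count.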


Reinforcement Learning Applied to Linear Quadratic Regulation

Neural Information Processing Systems

Recent research on reinforcement learning has focused on algorithms based on the principles of Dynamic Programming (DP). One of the most promising areas of application for these algorithms is the control of dynamical systems, and some impressive results have been achieved. However, there are significant gaps between practice and theory. In particular, there are no convergence proofs for problems with continuous state and action spaces, or for systems involving nonlinear function approximators (such as multilayer perceptrons). This paper presents research applying DP-based reinforcement learning theory to Linear Quadratic Regulation (LQR), an important class of control problems involving continuous state and action spaces and requiring a simple type of nonlinear function approximator. We describe an algorithm based on Q-learning that is proven to converge to the optimal controller for a large class of LQR problems. We also describe a slightly different algorithm that is only locally convergent to the optimal Q-function, demonstrating one of the possible pitfalls of using a nonlinear function approximator with DP-based learning.
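For LQR the Q-function is quadratic in the joint state-action vector, Q(x, u) = [x; u]^T H [x; u], so once H is known the greedy controller can be read off from its blocks. The sketch below shows only that extraction step under those assumptions; the names and partitioning are illustrative, and the paper's algorithm additionally estimates H online (e.g., by recursive least squares), which is not shown here.

```python
import numpy as np

def greedy_gain(H, n):
    """Extract the greedy linear feedback gain from a quadratic
    Q-function Q(x, u) = z^T H z, where z = [x; u] and the state x
    has dimension n. Minimizing over u gives u = -inv(H_uu) H_ux x."""
    H_ux = H[n:, :n]            # block coupling action to state
    H_uu = H[n:, n:]            # action-action block (assumed pos. def.)
    return -np.linalg.solve(H_uu, H_ux)   # gain K, with u = K @ x

# Tiny illustrative example: a symmetric H for a 1-D state and action.
H = np.array([[2.0, 0.5],
              [0.5, 1.0]])
K = greedy_gain(H, n=1)         # here K = [[-0.5]]
```

Because the Q-function is exactly quadratic for LQR, this "nonlinear function approximator" has a known parametric form, which is what makes a convergence proof tractable for this problem class.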

