Approximate Solutions to Optimal Stopping Problems

Dec-31-1997–Neural Information Processing Systems

We propose and analyze an algorithm that approximates solutions to the problem of optimal stopping in a discounted irreducible aperiodic Markov chain. The scheme involves the use of linear combinations of fixed basis functions to approximate a Q-function. The weights of the linear combination are incrementally updated through an iterative process similar to Q-learning, involving simulation of the underlying Markov chain. Due to space limitations, we only provide an overview of a proof of convergence (with probability 1) and bounds on the approximation error. This is the first theoretical result that establishes the soundness of a Q-learning-like algorithm when combined with arbitrary linear function approximators to solve a sequential decision problem.
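The abstract does not spell out the update rule, so the following is only a minimal sketch of what a Q-learning-like scheme of this kind could look like, not the authors' implementation. It assumes a temporal-difference-style update of the weights along a single simulated trajectory, with the continuation value approximated as a linear combination of fixed basis functions. The chain (a lazy reflecting random walk), the two basis functions, the stopping payoff, the step-size schedule, and all function names are hypothetical choices made for illustration.

```python
import random

def features(state, n_states):
    # Hypothetical fixed basis: a constant and a normalized linear term.
    return [1.0, state / (n_states - 1)]

def step(state, n_states, rng):
    # Assumed chain: lazy reflecting random walk (stay, left, or right).
    # Staying put with positive probability keeps the chain aperiodic.
    move = rng.choice([-1, 0, 1])
    return min(max(state + move, 0), n_states - 1)

def stop_reward(state, n_states):
    # Hypothetical payoff received when stopping in `state`.
    return max(0.0, 0.5 - state / (n_states - 1))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def fit_continuation_weights(n_states=11, discount=0.9, n_steps=20000, seed=0):
    """Estimate weights of a linear continuation-value approximation."""
    rng = random.Random(seed)
    weights = [0.0, 0.0]
    state = n_states // 2
    for t in range(n_steps):
        nxt = step(state, n_states, rng)
        phi = features(state, n_states)
        # Value of the next state: the better of stopping there or
        # continuing, using the current approximation for continuing.
        next_value = max(stop_reward(nxt, n_states),
                         dot(features(nxt, n_states), weights))
        # Assumed update: no per-stage reward, discounted next value.
        td_error = discount * next_value - dot(phi, weights)
        lr = 10.0 / (100.0 + t)  # hypothetical decaying step size
        weights = [w + lr * td_error * p for w, p in zip(weights, phi)]
        state = nxt
    return weights

def should_stop(state, weights, n_states=11):
    # Stop when the immediate payoff is at least the estimated
    # value of continuing.
    return stop_reward(state, n_states) >= dot(features(state, n_states), weights)
```

Calling `fit_continuation_weights()` and then `should_stop(state, weights)` for each state gives an approximate stopping rule for this toy chain. The quality of that rule depends on how well the chosen basis can represent the true continuation value.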

algorithm, artificial intelligence, survey article

Country:
- Europe > United Kingdom
  - England > Cambridgeshire > Cambridge (0.14)
- North America > United States
  - Massachusetts > Middlesex County > Cambridge (0.15)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Learning Graphical Models > Undirected Networks
    - Markov Models (0.57)
  - Reinforcement Learning (1.00)
