Tight performance bounds on greedy policies based on imperfect value functions

Feb-1-1993–Classics

Reinforcement learning is an effective technique for learning action policies in discrete stochastic environments, but its efficiency can decay exponentially with the size of the state space. In many situations, significant portions of a large state space may be irrelevant to a specific goal and can be aggregated into a few relevant states. The U Tree algorithm generates a tree-based state discretization that efficiently finds the relevant state chunks of large propositional domains. In this paper, we extend the U Tree algorithm to challenging domains with a continuous state space for which there is no initial discretization.

artificial intelligence, machine learning, reinforcement learning

Country:
- North America > United States
  - Pennsylvania > Allegheny County
    - Pittsburgh (0.14)
  - New Jersey > Mercer County
    - Princeton (0.04)
  - Massachusetts
    - Middlesex County > Cambridge (0.04)
    - Hampshire County > Amherst (0.04)
  - California > Monterey County
    - Monterey (0.04)
- Europe > United Kingdom
  - England > Cambridgeshire > Cambridge (0.04)

Genre:
- Research Report (0.68)

Industry:
- Leisure & Entertainment (0.46)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning (1.00)
  - Machine Learning
    - Reinforcement Learning (0.96)
    - Decision Tree Learning (0.70)
