Nonparametric Representation of Policies and Value Functions: A Trajectory-Based Approach

Dec-31-2003–Neural Information Processing Systems

A longstanding goal of reinforcement learning is to develop nonparametric representations of policies and value functions that support rapid learning without suffering from interference or the curse of dimensionality. We have developed a trajectory-based approach, in which policies and value functions are represented nonparametrically along trajectories. These trajectories, policies, and value functions are updated as the value function becomes more accurate or as a model of the task is updated. We have applied this approach to periodic tasks such as hopping and walking, which required handling discount factors and discontinuities in the task dynamics, and using function approximation to represent value functions at discontinuities. We also describe extensions of the approach to make the policies more robust to modeling error and sensor noise.
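To make the idea of representing a policy and value function "along trajectories" concrete, the following is a minimal sketch and not the authors' implementation. It assumes a single stored trajectory, a discounted return computed by a backward pass, and a plain nearest-neighbor lookup; the abstract does not specify the local model used, and the names `TrajectoryStore`, `discounted_values`, and `query` are hypothetical.

```python
import numpy as np


def discounted_values(rewards, gamma):
    """Backward pass along one trajectory: V[t] = r[t] + gamma * V[t+1]."""
    values = np.zeros(len(rewards))
    running = 0.0
    for t in range(len(rewards) - 1, -1, -1):
        running = rewards[t] + gamma * running
        values[t] = running
    return values


class TrajectoryStore:
    """Hypothetical container: states, actions, and values stored only at
    the points visited by a trajectory (no global parametric model)."""

    def __init__(self, states, actions, rewards, gamma):
        self.states = np.asarray(states, dtype=float)
        self.actions = np.asarray(actions, dtype=float)
        self.values = discounted_values(np.asarray(rewards, dtype=float), gamma)

    def query(self, state):
        """Return the stored action and value at the nearest trajectory point.

        Nearest-neighbor lookup is an assumption made for illustration.
        """
        diffs = self.states - np.asarray(state, dtype=float)
        i = int(np.argmin(np.linalg.norm(diffs, axis=1)))
        return self.actions[i], self.values[i]


# Usage: three stored points on a line, unit rewards, discount factor 0.5.
store = TrajectoryStore(
    states=[[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]],
    actions=[0.1, 0.2, 0.3],
    rewards=[1.0, 1.0, 1.0],
    gamma=0.5,
)
action, value = store.query([0.9, 0.1])
```

In this sketch, updating the representation as the value function or task model improves would amount to regenerating the stored trajectory and rerunning the backward pass.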

discontinuity, trajectory, value function

Country:
- North America > United States
  - New Jersey (0.04)
  - Pennsylvania > Allegheny County
    - Pittsburgh (0.14)
  - New York > New York County
    - New York City (0.04)
  - Massachusetts > Middlesex County
    - Cambridge (0.05)
- Asia > Japan
  - Honshū > Kansai > Kyoto Prefecture > Kyoto (0.04)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning (1.00)
  - Representation & Reasoning
    - Uncertainty (0.71)
    - Optimization (0.69)
