Learning Factored Representations for Partially Observable Markov Decision Processes

Dec-31-2000–Neural Information Processing Systems

The problem of reinforcement learning in a non-Markov environment is explored using a dynamic Bayesian network, where conditional independence assumptions between random variables are compactly represented by network parameters. The parameters are learned online, and approximations are used to perform inference and to compute the optimal value function. The relative effects of inference and value function approximations on the quality of the final policy are investigated by learning to solve a moderately difficult driving task. The two value function approximations, linear and quadratic, were found to perform similarly, but the quadratic model was more sensitive to initialization. Both performed below the level of human performance on the task.
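As a rough illustration of how a linear and a quadratic value function approximation differ, the sketch below evaluates both on a small belief vector. It assumes the value function takes a belief state (a probability vector over hidden states) as input. The names `linear_value`, `quadratic_value`, `w`, and `Q` are hypothetical and are not taken from the paper, whose exact parameterization is not given in this abstract.

```python
def linear_value(belief, w):
    """Linear approximation: V(b) = sum_i w[i] * b[i]."""
    return sum(wi * bi for wi, bi in zip(w, belief))


def quadratic_value(belief, w, Q):
    """Quadratic approximation: V(b) = w.b + sum_ij b[i] * Q[i][j] * b[j]."""
    n = len(belief)
    quad = sum(
        belief[i] * Q[i][j] * belief[j]
        for i in range(n)
        for j in range(n)
    )
    return linear_value(belief, w) + quad


# Hypothetical three-state belief and weights, chosen only for illustration.
belief = [0.5, 0.25, 0.25]
w = [1.0, 0.0, -1.0]
Q_zero = [[0.0] * 3 for _ in range(3)]
Q_identity = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]

print(linear_value(belief, w))
print(quadratic_value(belief, w, Q_zero))
print(quadratic_value(belief, w, Q_identity))
```

With `Q` set to zero the quadratic form reduces to the linear one. The quadratic model has a full matrix of additional parameters to initialize, whereas the linear model has only the weight vector.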

approximation, artificial intelligence, machine learning


Country:
- North America
  - Canada > Ontario
    - Toronto (0.15)
  - United States > New York (0.15)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning > Learning Graphical Models
    - Directed Networks > Bayesian Learning (1.00)
    - Undirected Networks > Markov Models (1.00)
  - Representation & Reasoning > Uncertainty (1.00)
