Algorithms for Learning Markov Field Policies

Mar-14-2024, 13:48:51 GMT–Neural Information Processing Systems

We use a graphical model for representing policies in Markov Decision Processes. This new representation can easily incorporate domain knowledge in the form of a state similarity graph that loosely indicates which states are supposed to have similar optimal actions. A bias is then introduced into the policy search process by sampling policies from a distribution that assigns high probabilities to policies that agree with the provided state similarity graph, i.e. smoother policies.
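The idea of sampling policies that agree with a state similarity graph can be illustrated with a minimal sketch. It is not the paper's algorithm. It assumes a Potts-style Markov random field in which each edge of the similarity graph rewards its two endpoint states for choosing the same action, and it draws a deterministic policy with a plain Gibbs sampler. The names `smoothness`, `gibbs_sample_policy`, `beta`, and `sweeps` are hypothetical and chosen only for illustration.

```python
import math
import random


def smoothness(policy, edges):
    """Count similarity-graph edges whose two endpoint states share an action."""
    return sum(1 for s, t in edges if policy[s] == policy[t])


def gibbs_sample_policy(n_states, n_actions, edges, beta=2.0, sweeps=50, seed=0):
    """Draw one deterministic policy from an assumed Potts-style distribution.

    Assumed model (not taken from the source):
        P(policy) is proportional to exp(beta * smoothness(policy, edges)),
    so a larger beta puts more probability on policies that agree with the graph.
    """
    rng = random.Random(seed)

    # Adjacency lists of the undirected state similarity graph.
    neighbors = [[] for _ in range(n_states)]
    for s, t in edges:
        neighbors[s].append(t)
        neighbors[t].append(s)

    # Start from a uniformly random policy: one action index per state.
    policy = [rng.randrange(n_actions) for _ in range(n_states)]

    for _ in range(sweeps):
        for s in range(n_states):
            # Conditional of state s given its neighbors: each action is
            # weighted by how many neighboring states currently use it.
            weights = [
                math.exp(beta * sum(1 for t in neighbors[s] if policy[t] == a))
                for a in range(n_actions)
            ]
            policy[s] = rng.choices(range(n_actions), weights=weights)[0]

    return policy


if __name__ == "__main__":
    # A chain of 10 states where consecutive states are marked as similar.
    chain = [(i, i + 1) for i in range(9)]
    sample = gibbs_sample_policy(10, 3, chain, beta=5.0)
    print(sample, smoothness(sample, chain))
```

With `beta=0` the sampler reduces to uniform random policies. Increasing `beta` concentrates the samples on policies that assign the same action to states connected in the graph, which is the kind of bias toward smoother policies that the abstract describes. A policy search method would still need to score the sampled policies by their return, which this sketch leaves out.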

Bellman error, learning, value function

Country:
- North America > United States (0.04)
- Europe > Germany
  - Hesse > Darmstadt Region
    - Darmstadt (0.04)
  - Baden-Württemberg > Tübingen Region
    - Tübingen (0.04)

Technology:
- Information Technology > Artificial Intelligence
  - Robots (0.94)
  - Representation & Reasoning > Optimization (0.68)
  - Machine Learning
    - Statistical Learning (1.00)
    - Reinforcement Learning (1.00)
    - Learning Graphical Models > Undirected Networks
      - Markov Models (0.68)