Algorithms for Learning Markov Field Policies
Neural Information Processing Systems
We represent policies in Markov Decision Processes (MDPs) with a graphical model. This representation readily incorporates domain knowledge in the form of a state similarity graph that loosely indicates which states are expected to have similar optimal actions. The policy search is then biased by sampling policies from a distribution that assigns high probability to policies that agree with the provided state similarity graph, i.e., smoother policies.
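As a rough illustration of sampling policies that agree with a state similarity graph, the sketch below assumes a Potts-style Markov random field over deterministic policies, with P(policy) proportional to exp(beta * number of graph edges whose two states share an action), and draws from it by Gibbs sampling. The abstract does not specify the distribution or the sampler, so this model, the function names, and the parameters `beta` and `sweeps` are assumptions for illustration only, not the paper's method.

```python
import math
import random


def smoothness(policy, edges):
    """Count similarity-graph edges whose two states share the same action."""
    return sum(1 for s, t in edges if policy[s] == policy[t])


def gibbs_sample_policy(n_states, n_actions, edges, beta=1.0, sweeps=50, seed=0):
    """Draw a deterministic policy from an assumed Potts-style prior.

    Assumed (hypothetical) prior: P(policy) ~ exp(beta * smoothness(policy, edges)).
    Larger beta favors policies that agree with the similarity graph.
    """
    rng = random.Random(seed)

    # Adjacency lists of the undirected state similarity graph.
    neighbors = [[] for _ in range(n_states)]
    for s, t in edges:
        neighbors[s].append(t)
        neighbors[t].append(s)

    # Start from a uniformly random policy (one action index per state).
    policy = [rng.randrange(n_actions) for _ in range(n_states)]

    for _ in range(sweeps):
        for s in range(n_states):
            # Conditional weight of each action given the neighbors' actions.
            weights = [
                math.exp(beta * sum(1 for t in neighbors[s] if policy[t] == a))
                for a in range(n_actions)
            ]
            policy[s] = rng.choices(range(n_actions), weights=weights)[0]
    return policy


if __name__ == "__main__":
    # Chain of five states in which adjacent states are marked as similar.
    chain = [(0, 1), (1, 2), (2, 3), (3, 4)]
    sample = gibbs_sample_policy(5, 3, chain, beta=2.0)
    print(sample, smoothness(sample, chain))
```

With `beta = 0` the sampler ignores the graph and returns uniformly random policies; increasing `beta` concentrates probability on policies in which similar states take the same action. In a policy search loop, samples like these would be evaluated on the MDP, a step this sketch leaves out.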
Mar-14-2024, 13:48:51 GMT
- Country:
  - Europe > Germany
    - Baden-Württemberg > Tübingen Region
      - Tübingen (0.04)
    - Hesse > Darmstadt Region
      - Darmstadt (0.04)
  - North America > United States (0.04)
- Technology: