Reviews: Data-Efficient Reinforcement Learning in Continuous State-Action Gaussian-POMDPs

Neural Information Processing Systems 

This paper describes an extension to the PILCO algorithm (Probabilistic Inference for Learning COntrol), a data-efficient reinforcement learning algorithm. The proposed method applies a measurement filter during the actual experiment and explicitly accounts for this filter during the policy learning step, which uses data from that experiment. This is an important practical extension, since measurements are often very noisy. My intuitive explanation is that filtering out most of the noise makes the overall feedback system more "repeatable", so learning is faster (provided the filtering is effective; see the last sentence of the conclusion). The paper presents detailed mathematical derivations and strong simulation results that highlight the properties of the proposed algorithm.
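
To make the filtering idea concrete, below is a minimal sketch, not the authors' implementation (which filters with the learned GP dynamics model): a generic linear-Gaussian (Kalman) filter sits in the control loop, and the policy acts on the filtered belief mean rather than on the raw noisy observation. All names here (kalman_step, filtered_rollout, A, B, C, Q, R, env_step, policy) are illustrative assumptions.

import numpy as np

# Illustrative sketch only: a generic linear-Gaussian (Kalman) filter stands in
# for the paper's GP-based filtering. The policy acts on the filtered belief
# mean rather than on the raw noisy observation.
# All names (A, B, C, Q, R, env_step, policy) are assumed for illustration.

def kalman_step(mu, Sigma, u, y, A, B, C, Q, R):
    """One predict/update cycle of a linear-Gaussian filter."""
    mu_pred = A @ mu + B @ u                    # predict belief through dynamics
    Sigma_pred = A @ Sigma @ A.T + Q
    S = C @ Sigma_pred @ C.T + R                # innovation covariance
    K = Sigma_pred @ C.T @ np.linalg.inv(S)     # Kalman gain
    mu_new = mu_pred + K @ (y - C @ mu_pred)    # correct with noisy measurement y
    Sigma_new = (np.eye(len(mu)) - K @ C) @ Sigma_pred
    return mu_new, Sigma_new

def filtered_rollout(policy, env_step, mu0, Sigma0, A, B, C, Q, R, horizon=50):
    """Execute the policy on the filtered state estimate, not on raw observations."""
    mu, Sigma = mu0, Sigma0
    belief_means = [mu]
    for _ in range(horizon):
        u = policy(mu)                          # policy sees the belief mean
        y = env_step(u)                         # system returns a noisy measurement
        mu, Sigma = kalman_step(mu, Sigma, u, y, A, B, C, Q, R)
        belief_means.append(mu)
    return belief_means

The key point of the paper, as summarized above, is that this same filtering computation is also simulated during policy evaluation, so the learned controller is optimized for the filtered signal it will actually receive at execution time.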