Sparse Gaussian Process Temporal Difference Learning for Marine Robot Navigation

Martin, John, Wang, Jinkun, Englot, Brendan

Oct-2-2018–arXiv.org Machine Learning

We present a method for Temporal Difference (TD) learning that addresses several challenges faced by robots learning to navigate in a marine environment. For improved data efficiency, our method reduces TD updates to Gaussian Process regression. To make predictions amenable to online settings, we introduce a sparse approximation with improved quality over current rejection-based sparse methods. We derive the predictive value function posterior and use the moments to obtain a new algorithm for model-free policy evaluation, SPGP-SARSA. With simple changes, we show SPGP-SARSA can be reduced to a model-based equivalent, SPGP-TD. We perform comprehensive simulation studies and also conduct physical learning trials with an underwater robot. Our results show SPGP-SARSA can outperform the state-of-the-art sparse method, replicate the prediction quality of its exact counterpart, and be applied to solve underwater navigation tasks.

artificial intelligence, reinforcement learning, sarsa, (17 more...)

arXiv.org Machine Learning

Oct-2-2018

arXiv.org PDF

Add feedback

Country:
- Europe > Switzerland
  - Zürich > Zürich (0.14)
- North America > United States (1.00)

Genre:
- Research Report > New Finding (0.69)

Industry:
- Government > Regional Government > North America Government > United States Government (0.47)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found