Sparse Gaussian Process Temporal Difference Learning for Marine Robot Navigation
Martin, John, Wang, Jinkun, Englot, Brendan
We present a method for Temporal Difference (TD) learning that addresses several challenges faced by robots learning to navigate in a marine environment. For improved data efficiency, our method reduces TD updates to Gaussian Process regression. To make predictions amenable to online settings, we introduce a sparse approximation with improved quality over current rejection-based sparse methods. We derive the predictive value function posterior and use the moments to obtain a new algorithm for model-free policy evaluation, SPGP-SARSA. With simple changes, we show SPGP-SARSA can be reduced to a model-based equivalent, SPGP-TD. We perform comprehensive simulation studies and also conduct physical learning trials with an underwater robot. Our results show SPGP-SARSA can outperform the state-of-the-art sparse method, replicate the prediction quality of its exact counterpart, and be applied to solve underwater navigation tasks.
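To make the phrase "reduces TD updates to Gaussian Process regression" concrete, the following is a minimal sketch of the standard exact (non-sparse) GP-TD construction. It is not the paper's SPGP-SARSA. It assumes deterministic transitions, a zero-mean GP prior on the value function, and the observation model r_t = V(x_t) - gamma * V(x_{t+1}) + noise. The function names (`rbf_kernel`, `gptd_posterior`), the RBF kernel, and all parameter values are illustrative assumptions and do not come from the source.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=1.0):
    """Squared-exponential kernel between row-vector sets a (N, d) and b (M, d)."""
    a = np.atleast_2d(a)
    b = np.atleast_2d(b)
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

def gptd_posterior(states, rewards, query, gamma=0.9, noise_var=1e-6):
    """Posterior mean and variance of V at `query` given one trajectory.

    states:  (T+1, d) visited states x_0 .. x_T
    rewards: (T,)     rewards r_0 .. r_{T-1}
    """
    T = len(rewards)
    # H encodes the TD relation: row t picks V(x_t) - gamma * V(x_{t+1}).
    H = np.zeros((T, T + 1))
    idx = np.arange(T)
    H[idx, idx] = 1.0
    H[idx, idx + 1] = -gamma

    K = rbf_kernel(states, states)               # prior covariance of V at visited states
    G = H @ K @ H.T + noise_var * np.eye(T)      # covariance of the observed rewards
    B = H @ rbf_kernel(states, query)            # cross-covariance of rewards and V(query)

    mean = B.T @ np.linalg.solve(G, rewards)
    # The prior variance is 1.0 because this RBF kernel has k(x, x) = 1.
    var = 1.0 - np.sum(B * np.linalg.solve(G, B), axis=0)
    return mean, var, H

# Toy trajectory on a line: six states, reward 1 at every step (made-up data).
states = np.arange(6, dtype=float).reshape(-1, 1)
rewards = np.ones(5)
mean, var, H = gptd_posterior(states, rewards, states)
```

With a small noise variance, the posterior mean at the visited states approximately satisfies the TD relation, so `H @ mean` is close to `rewards`. The exact solve scales cubically with trajectory length, which is the cost that sparse approximations such as the one described in the abstract aim to reduce.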
Oct-2-2018
- Country:
- Europe > Switzerland
- North America > United States (1.00)
- Genre:
- Research Report > New Finding (0.69)