Deep Bayesian Reinforcement Learning for Spacecraft Proximity Maneuvers and Docking

Du, Desong; Qi, Naiming; Liu, Yanfang; Pan, Wei

Nov-6-2023–arXiv.org Artificial Intelligence

In pursuit of autonomous spacecraft proximity maneuvers and docking (PMD), we introduce a novel Bayesian actor-critic reinforcement learning algorithm that learns a control policy with a stability guarantee. The PMD task is formulated as a Markov decision process that reflects the relative dynamic model, the docking cone, and the cost function. Drawing on Lyapunov theory, we frame temporal difference learning as a constrained Gaussian process regression problem. This approach allows the state-value function to be expressed as a Lyapunov function, leveraging the Gaussian process and deep kernel learning. We develop a novel Bayesian quadrature policy optimization procedure to analytically compute the policy gradient while integrating Lyapunov-based stability constraints. This integration is pivotal in satisfying the rigorous safety demands of spaceflight missions. The proposed algorithm has been experimentally evaluated on a spacecraft air-bearing testbed and shows promising performance.
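
To make the idea of a state-value function acting as a Lyapunov function concrete, the following is a minimal toy sketch and not the paper's algorithm. It replaces the Gaussian process critic with a plain rollout sum, and the PMD relative dynamics with a hypothetical 1-D double integrator under a hand-picked PD policy. The names `step`, `cost`, `value`, `lyapunov_decrease` and the gains `kp`, `kd`, `dt` are all assumptions made for illustration. Under a stabilizing policy and a positive cost, the undiscounted cost-to-go is zero at the target, positive elsewhere, and decreases by roughly the stage cost at every step, which is the Lyapunov decrease condition.

```python
# Toy illustration only: a 1-D double integrator stands in for the
# relative dynamics, and a rollout sum stands in for the learned critic.

def step(state, dt=0.1, kp=1.0, kd=1.5):
    """Advance the toy system one step under a PD policy (hypothetical gains)."""
    pos, vel = state
    acc = -kp * pos - kd * vel  # control action chosen by the fixed policy
    return (pos + dt * vel, vel + dt * acc)


def cost(state):
    """Positive-definite stage cost: zero only at the target state (0, 0)."""
    pos, vel = state
    return pos * pos + vel * vel


def value(state, horizon=2000):
    """Undiscounted cost-to-go under the policy, truncated at `horizon` steps."""
    total = 0.0
    for _ in range(horizon):
        total += cost(state)
        state = step(state)
    return total


def lyapunov_decrease(state):
    """V(s') - V(s); negative values mean V decreases along the trajectory."""
    return value(step(state)) - value(state)


if __name__ == "__main__":
    s0 = (1.0, 0.0)
    print("V(s0)         =", value(s0))
    print("V(s1) - V(s0) =", lyapunov_decrease(s0))
```

In this sketch the decrease equals the negative stage cost up to a truncation term that vanishes as the state converges. Enforcing that kind of decrease as a constraint on a learned critic is the role the abstract assigns to the constrained Gaussian process regression.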

algorithm, gaussian process, lyapunov function

Country:
- Europe > Netherlands
  - South Holland > Delft (0.04)
- Asia
  - Middle East > Jordan (0.04)
  - China > Heilongjiang Province
    - Harbin (0.04)

Genre:
- Research Report (0.70)
- Overview (0.48)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Reinforcement Learning (1.00)
  - Learning Graphical Models > Directed Networks
    - Bayesian Learning (0.64)
