Safe Exploration in Finite Markov Decision Processes with Gaussian Processes
Matteo Turchetta, Felix Berkenkamp, Andreas Krause
Neural Information Processing Systems
In classical reinforcement learning, agents accept arbitrary short-term loss for long-term gain when exploring their environment. This is infeasible for safety-critical applications, such as robotics, where even a single unsafe action may cause system failure or harm the environment. In this paper, we address the problem of safely exploring finite Markov decision processes (MDPs). We define safety in terms of an a priori unknown safety constraint that depends on states and actions and satisfies certain regularity conditions expressed via a Gaussian process prior.
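A minimal sketch of the core idea, not the authors' SafeMDP implementation: model the unknown safety function with a GP and classify a state as safe only if its pessimistic (lower-confidence-bound) estimate clears a threshold. The 1-D grid layout, kernel choice, threshold `h`, and confidence scale `beta` below are illustrative assumptions; the full algorithm additionally ensures the agent can reach and return from candidate states, which this sketch omits.

```python
# Sketch: GP confidence-bound safety classification (assumed setup, not the
# paper's code). A state counts as safe only if even the pessimistic estimate
# of its safety value exceeds the threshold h.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)

# States of a small 1-D grid world and a hidden safety function over them.
states = np.arange(20, dtype=float).reshape(-1, 1)
true_safety = lambda s: np.sin(s.ravel() / 3.0)  # unknown to the agent
h = -0.25                                        # safety threshold (assumed)
beta = 2.0                                       # confidence scale (assumed)

# A few noisy safety observations gathered so far.
observed = np.array([[0.0], [1.0], [2.0]])
y = true_safety(observed) + 0.01 * rng.standard_normal(len(observed))

# The GP prior encodes the regularity assumption on the safety constraint.
gp = GaussianProcessRegressor(kernel=RBF(length_scale=2.0), alpha=1e-4)
gp.fit(observed, y)

mean, std = gp.predict(states, return_std=True)
lower = mean - beta * std                        # pessimistic estimate

# States whose lower confidence bound exceeds h are classified as safe.
safe_set = states.ravel()[lower > h]
print("states classified as safe:", safe_set)
```

Because the GP's confidence intervals shrink near observed states, the safe set grows as the agent gathers measurements, letting exploration expand without ever acting on a state whose safety is still uncertain.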