Mo' States Mo' Problems: Emergency Stop Mechanisms from Observation

Samuel Ainsworth, Matt Barnes, Siddhartha Srinivasa

Oct-3-2025, 06:32:14 GMT–Neural Information Processing Systems

In this paper, we consider the problem of determining at what point along a training roll-out feedback from the environment is no longer beneficial and an intervention, such as resetting the agent to the initial state distribution, is warranted. We show that such interventions can naturally trade off a small sub-optimality gap for a dramatic decrease in sample complexity. In particular, we focus on the reinforcement learning setting in which the agent has access to a reward signal in addition to either (a) an expert supervisor triggering the emergency stop (e-stop) mechanism in real time or (b) expert state-only demonstrations used to "learn" an automatic e-stop trigger.
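To make option (b) concrete, here is a minimal sketch of one way an automatic e-stop trigger could be derived from state-only demonstrations. It is not the authors' algorithm: it assumes a small discrete state space and treats the set of states seen in the demonstrations as the allowed region. The names `learn_support`, `rollout_with_estop`, and `toy_step` are hypothetical and chosen for illustration only.

```python
def learn_support(demos):
    """Collect every state visited in the state-only expert demonstrations."""
    return {state for trajectory in demos for state in trajectory}


def rollout_with_estop(step, policy, start, support, horizon):
    """Run one episode, ending it early once the state leaves `support`.

    Returns (total_reward, visited_states, estop_fired). When the e-stop
    fires, the caller is expected to reset the agent to a start state.
    """
    state, total, visited = start, 0.0, [start]
    for _ in range(horizon):
        next_state, reward = step(state, policy(state))
        if next_state not in support:
            # Assumption: no reward is collected on the transition that
            # triggers the e-stop.
            return total, visited, True
        total += reward
        state = next_state
        visited.append(state)
    return total, visited, False


def toy_step(state, action):
    """Hypothetical toy dynamics: walk on the integers, reward 1.0 at state 5."""
    next_state = state + action
    return next_state, (1.0 if next_state == 5 else 0.0)


if __name__ == "__main__":
    support = learn_support([[0, 1, 2, 3, 4, 5]])
    # A policy that walks away from the demonstrated states is stopped at once.
    print(rollout_with_estop(toy_step, lambda s: -1, 0, support, horizon=10))
```

Under these assumptions the learner never collects samples outside the demonstrated region, which is the intuition behind trading a small sub-optimality gap for lower sample complexity. A continuous state space would need a different membership test for the support set.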

algorithm, probability, reinforcement

Country:
- North America
  - United States (0.28)
  - Canada (0.04)

Genre:
- Research Report > New Finding (0.46)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning > Reinforcement Learning (1.00)
  - Representation & Reasoning (0.95)
