Robust Reinforcement Learning in Motion Planning

Singh, Satinder P., Barto, Andrew G., Grupen, Roderic, Connolly, Christopher

Dec-31-1994–Neural Information Processing Systems

While exploring to find better solutions, an agent performing online reinforcement learning (RL) can perform worse than is acceptable. In some cases, exploration might have unsafe, or even catastrophic, results, often modeled in terms of reaching 'failure' states of the agent's environment. This paper presents a method that uses domain knowledge to reduce the number of failures during exploration. This method formulates the set of actions from which the RL agent composes a control policy to ensure that exploration is conducted in a policy space that excludes most of the unacceptable policies. The resulting action set has a more abstract relationship to the task being solved than is common in many applications of RL. Although the cost of this added safety is that learning may result in a suboptimal solution, we argue that this is an appropriate tradeoff in many problems. We illustrate this method in the domain of motion planning.

This work was done while the first author was finishing his Ph.D. in computer science at the University of Massachusetts, Amherst.

artificial intelligence, reinforcement learning, trajectory


Country:
- North America > United States > Massachusetts > Hampshire County > Amherst (0.34)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning > Reinforcement Learning (1.00)
  - Robots (1.00)
