Reinforcement Learning for Mixed Open-loop and Closed-loop Control
Hansen, Eric A., Barto, Andrew G., Zilberstein, Shlomo
Neural Information Processing Systems
Closed-loop control relies on sensory feedback that is usually assumed to be free. But if sensing incurs a cost, it may be cost-effective to take sequences of actions in open-loop mode. We describe a reinforcement learning algorithm that learns to combine open-loop and closed-loop control when sensing incurs a cost. Although we assume reliable sensors, use of open-loop control means that actions must sometimes be taken when the current state of the controlled system is uncertain. This is a special case of the hidden-state problem in reinforcement learning, and to cope, our algorithm relies on short-term memory. The main result of the paper is a rule that significantly limits exploration of possible memory states by pruning memory states for which the estimated value of information is greater than its cost. We prove that this rule allows convergence to an optimal policy.
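One possible reading of the pruning rule is sketched below in Python. It is an illustration under stated assumptions, not the authors' algorithm: a memory state is taken to be the tuple of actions executed since the last observation, and `voi_estimate`, `sensing_cost`, `max_depth`, and `explore_memory_states` are hypothetical names introduced here. Once the estimated value of information at a memory state exceeds the sensing cost, sensing is assumed to be worthwhile there, so longer open-loop sequences that extend that state are not explored.

```python
from collections import deque


def explore_memory_states(actions, voi_estimate, sensing_cost, max_depth):
    """Enumerate memory states breadth-first, pruning extensions.

    A memory state is the tuple of actions taken since the last observation
    (an assumption of this sketch). voi_estimate is a caller-supplied
    function returning the estimated value of information for a state.
    """
    kept = []
    frontier = deque([()])  # empty tuple: the state has just been sensed
    while frontier:
        memory = frontier.popleft()
        kept.append(memory)
        if len(memory) >= max_depth:
            continue  # hard cap on open-loop sequence length
        if memory and voi_estimate(memory) > sensing_cost:
            continue  # sensing pays off here, so do not extend this state
        for action in actions:
            frontier.append(memory + (action,))
    return kept


if __name__ == "__main__":
    # Toy estimate: uncertainty, and hence value of information,
    # grows with the number of unobserved actions.
    states = explore_memory_states(["a", "b"], lambda m: 0.3 * len(m), 0.5, 5)
    print(len(states), "memory states explored")
```

With the toy estimate in the usage example, exploration stops at sequences of length two, although `max_depth` would allow length five.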
Dec-31-1997
- Country:
- Europe > United Kingdom
- England > Cambridgeshire > Cambridge (0.14)
- North America > United States
- Massachusetts > Hampshire County > Amherst (0.14)