Improved Switching among Temporally Abstract Actions
–Neural Information Processing Systems
In robotics and other control applications it is commonplace to have a pre-existing set of controllers for solving subtasks, perhaps hand-crafted or previously learned or planned, and still face a difficult problem of how to choose and switch among the controllers to solve an overall task as well as possible. In this paper we present a framework based on Markov decision processes and semi-Markov decision processes for phrasing this problem, a basic theorem regarding the improvement in performance that can be obtained by switching flexibly between given controllers, and example applications of the theorem. In particular, we show how an agent can plan with these high-level controllers and then use the results of such planning to find an even better plan, by modifying the existing controllers, with negligible additional cost and no re-planning. In one of our examples, the complexity of the problem is reduced from 24 billion state-action pairs to less than a million state-controller pairs. In many applications, solutions to parts of a task are known, either because they were hand-crafted by people or because they were previously learned or planned.
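As a rough illustration of what "switching flexibly" can look like in practice, the sketch below assumes that planning has already produced a value estimate for every state-controller pair. The table layout, the function names, and the controller names are hypothetical and do not come from the paper. The rule shown, abandoning the running controller as soon as another one has a strictly higher estimated value in the current state, is one plausible reading of the idea and not necessarily the exact condition of the paper's theorem.

```python
from typing import Dict, Hashable

# Hypothetical table: q[state][controller] is the estimated value of running
# `controller` from `state` and then following the high-level plan.
QTable = Dict[Hashable, Dict[str, float]]


def best_controller(q: QTable, state: Hashable) -> str:
    """Return the controller with the highest estimated value in `state`."""
    return max(q[state], key=q[state].get)


def should_switch(q: QTable, state: Hashable, current: str) -> bool:
    """Return True when continuing `current` is strictly worse than the best
    alternative available in `state` (ties do not trigger a switch)."""
    return q[state][current] < q[state][best_controller(q, state)]


# Toy values, invented purely for illustration.
q_example: QTable = {
    "s0": {"go_left": 1.0, "go_right": 2.0},
    "s1": {"go_left": 3.0, "go_right": 3.0},
}

if __name__ == "__main__":
    for state, running in [("s0", "go_left"), ("s1", "go_left")]:
        if should_switch(q_example, state, running):
            print(state, "switch to", best_controller(q_example, state))
        else:
            print(state, "keep", running)
```

The check is a single table lookup per step, which fits the abstract's claim that the improved behavior comes at negligible additional cost and without re-planning.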
Apr-6-2023, 17:33:42 GMT