When Waiting Is Not an Option: Learning Options With a Deliberation Cost
Harb, Jean (McGill University) | Bacon, Pierre-Luc (McGill University) | Klissarov, Martin (McGill University) | Precup, Doina (McGill University)
This perspective helps us to formulate more precisely what objective criteria should be fulfilled during option construction. We propose that good options are those which allow an agent to learn and plan faster, and provide an optimization objective for learning options based on this idea. We implement the optimization using the option-critic framework (Bacon et al. 2017) and illustrate its usefulness with experiments in Atari games.

Temporal abstraction has a rich history in AI (Minsky 1961; Fikes et al. 1972; Kuipers 1979; Korf 1983; Iba 1989; Drescher 1991; Dayan and Hinton 1992; Kaelbling 1993; Thrun and Schwartz 1995; Parr and Russell 1998; Dietterich 1998) and has been presented as a useful mechanism for a variety of problems that affect AI systems in many settings, including to: generate shorter plans, speed up planning, improve generalization, yield better exploration, increase
Feb-8-2018