When Waiting Is Not an Option: Learning Options With a Deliberation Cost

Harb, Jean (McGill University) | Bacon, Pierre-Luc (McGill University) | Klissarov, Martin (McGill University) | Precup, Doina (McGill University)

AAAI Conferences 

Temporal abstraction has a rich history in AI (Minsky 1961; Fikes et al. 1972; Kuipers 1979; Korf 1983; Iba 1989; Drescher 1991; Dayan and Hinton 1992; Kaelbling 1993; Thrun and Schwartz 1995; Parr and Russell 1998; Dietterich 1998) and has been presented as a useful mechanism for a variety of problems that affect AI systems in many settings, including to: generate shorter plans, speed up planning, improve generalization, yield better exploration, increase ...

This perspective helps us to formulate more precisely what objective criteria should be fulfilled during option construction. We propose that good options are those which allow an agent to learn and plan faster, and provide an optimization objective for learning options based on this idea. We implement the optimization using the option-critic framework (Bacon et al. 2017) and illustrate its usefulness with experiments in Atari games.
