Learning Propositional Functions for Planning and Reinforcement Learning
Hershkowitz, David Ellis (Brown University) | MacGlashan, James (Brown University) | Tellex, Stefanie (Brown University)
Massive state spaces are ubiquitous throughout planning and reinforcement learning (RL) domains: agents involved in furniture assembly, cooking automation, and backgammon must grapple with problem formalisms that are far too large to solve with conventional tabular approaches. However, modern planning and RL techniques bypass this difficulty by using propositional functions to transfer knowledge across states, both within and across problem instances, and thereby solve for near-optimal behaviors in very large state spaces. Here we present a means by which useful propositional functions can be inferred from observations of transition dynamics. Our approach is based on distilling salient relational values between pairs of objects. We then use these learned propositional functions to free the RL algorithm deterministic object-oriented RMAX (DOORMAX) of its dependence on expert-provided propositional functions. We also empirically demonstrate high correspondence between the learned and the expert-provided propositional functions. Our novel variant of DOORMAX performs at a level near that of classic DOORMAX.
Nov-1-2015