Compositional Reinforcement Learning from Logical Specifications
Kishor Jothimurugan (University of Pennsylvania), Suguman Bansal (University of Pennsylvania), Osbert Bastani
Neural Information Processing Systems
We study the problem of learning control policies for complex tasks given by logical specifications. Recent approaches automatically generate a reward function from a given specification and use a suitable reinforcement learning algorithm to learn a policy that maximizes the expected reward. These approaches, however, scale poorly to complex tasks that require high-level planning.