A Composable Specification Language for Reinforcement Learning Tasks
Reinforcement learning is a promising approach for learning control policies for robot tasks. However, specifying complex tasks (e.g., with multiple objectives and safety constraints) can be challenging, since the user must design a reward function that encodes the entire task. Furthermore, the user often needs to manually shape the reward to ensure convergence of the learning algorithm. We propose a language for specifying complex control tasks, along with an algorithm that compiles specifications in our language into a reward function and automatically performs reward shaping. We implement our approach in a tool called SPECTRL, and show that it outperforms several state-of-the-art baselines.
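To make the reward-shaping motivation concrete, here is a minimal illustrative sketch (not SPECTRL's actual compilation, which builds a task monitor): a "reach the goal, avoid the obstacle" task written first as a sparse reward, then as a shaped variant with a dense progress signal. The coordinates `GOAL` and `OBSTACLE` are made-up for illustration.

```python
# Illustrative only: a sparse vs. shaped reward for a
# "reach the goal while avoiding the obstacle" task.
import math

GOAL = (5.0, 5.0)       # hypothetical goal position
OBSTACLE = (2.0, 2.0)   # hypothetical obstacle position

def sparse_reward(state):
    # Unshaped reward: +1 only on task completion, -1 on violation,
    # 0 everywhere else -- hard for RL to learn from.
    if math.dist(state, OBSTACLE) < 0.5:
        return -1.0
    if math.dist(state, GOAL) < 0.5:
        return 1.0
    return 0.0

def shaped_reward(state):
    # Shaped variant: keeps the hard obstacle penalty but adds a
    # dense signal (negative distance to goal) that rewards progress.
    if math.dist(state, OBSTACLE) < 0.5:
        return -1.0
    return -math.dist(state, GOAL)

# A state nearer the goal now gets strictly higher reward.
print(shaped_reward((4.0, 4.0)) > shaped_reward((0.0, 0.0)))  # True
```

The point of automating this step, as the paper proposes, is that a user writes only the task specification and never hand-tunes the dense signal.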
Reviews: A Composable Specification Language for Reinforcement Learning Tasks
The specification language seems to be similar to past work, being a restricted form of temporal logic. The atomic predicates come in two flavours: ("eventually") achieving a certain state, or ("always") avoiding certain states. Various compositions of these atomic predicates can be used (A then B, A or B, etc.). The paper's proposed finite-state-machine "task monitor" bears resemblance to the FSM "reward machines" proposed by Icarte et al. [1], which were not cited or discussed. So I will be quite interested in how the authors clarify its differences from reward machines.
Reviews: A Composable Specification Language for Reinforcement Learning Tasks
The paper presents and evaluates SPECTRL, a framework for transforming formal task specifications into shaped reward functions. The reviewers agreed that, while it is not obvious that this paper will be extremely impactful, it is nonetheless interesting, convincing, and clearly written. After some discussion, the consensus leans towards acceptance, although with some outstanding issues (especially regarding the cartpole results) which should be addressed before publication. It is also highly recommended that a reference implementation of this method be released for use within the community, although it is not in my power to make this a formal requirement for publication.
Kishor Jothimurugan, Rajeev Alur, Osbert Bastani
Papers published at the Neural Information Processing Systems Conference.