 Reinforcement Learning


Giving Feedback on Interactive Student Programs with Meta-Exploration

Neural Information Processing Systems

One approach toward automatic grading is to learn an agent that interacts with a student's program and explores states indicative of errors via reinforcement learning. However, existing work on this approach only provides binary feedback of whether a program is correct or not, while students require finer-grained feedback on the specific errors in their programs to understand their mistakes. In this work, we show that exploring to discover errors can be cast as a meta-exploration problem.

Reinforcement Learning with a Terminator

Guy Tennenholtz

Neural Information Processing Systems

We present the problem of reinforcement learning with exogenous termination. We define the Termination Markov Decision Process (TerMDP), an extension of the MDP framework, in which episodes may be interrupted by an external non-Markovian observer.
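
To make the idea of exogenous, non-Markovian termination concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's model: the toy chain environment, the per-step cost, the `bias` tolerance, and the logistic termination rule are all hypothetical choices. The only property taken from the abstract is that an external observer may interrupt the episode, and that its decision is not a function of the current state alone.

```python
import math
import random


class TerminatingChain:
    """Toy chain MDP with an external, history-dependent terminator.

    Hypothetical illustration: the chain, the costs, and the logistic
    termination rule are assumptions made for this sketch.
    """

    def __init__(self, length=10, bias=3.0, seed=0):
        self.length = length      # index of the goal state
        self.bias = bias          # assumed tolerance of the external observer
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        self.state = 0
        self.accumulated_cost = 0.0
        return self.state

    def termination_probability(self):
        # Depends on the cost accumulated over the whole episode,
        # so two visits to the same state can have different values.
        return 1.0 / (1.0 + math.exp(-(self.accumulated_cost - self.bias)))

    def step(self, action):
        # action 0: cautious step (move 1, cost 0)
        # action 1: risky step (move 2, cost 1)
        move, cost = (2, 1.0) if action == 1 else (1, 0.0)
        self.state = min(self.state + move, self.length)
        self.accumulated_cost += cost

        reached_goal = self.state == self.length
        reward = 1.0 if reached_goal else 0.0
        # The observer may interrupt the episode before the goal is reached.
        interrupted = (not reached_goal) and (
            self.rng.random() < self.termination_probability()
        )
        return self.state, reward, reached_goal, interrupted
```

In this sketch, reaching state 2 by one risky step leaves a higher termination probability than reaching it by two cautious steps, which is what makes the termination signal depend on the history rather than on the state.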