Reviews: Learning Temporal Point Processes via Reinforcement Learning

Neural Information Processing Systems 

The paper "Learning Temporal Point Processes via Reinforcement Learning" proposes a new way to learn temporal point processes, where the intensity function is defined via recurrent neural networks rather than classical parametric forms. This enables a better fit to the true generative process. It builds on the WGAN approach, but rather than dealing with a minimax optimization problem, the authors propose to use an RKHS formulation to find an analytic maximum for the expected cumulative discrepancy between the processes underlying the observed sequences and the generated ones. The results look convincing and the inverse reinforcement learning approach is elegant, but I am missing some justification and clarification regarding the authors' claim that their way of learning allows them to define better-specified models than parametric ones.
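
To make the RKHS step concrete, here is a minimal sketch of how I read the analytic maximum. It assumes the maximizer over the unit ball of the RKHS takes the usual form of a difference of kernel mean embeddings between observed and generated event times, evaluated empirically. The Gaussian kernel, the bandwidth, and the names `gaussian_kernel` and `reward_estimate` are my own illustrative choices, not the authors' code, and normalization by the RKHS norm is omitted.

```python
import numpy as np


def gaussian_kernel(s, t, bandwidth=1.0):
    """Gaussian kernel matrix between event times s and query times t."""
    s = np.asarray(s, dtype=float)[:, None]
    t = np.asarray(t, dtype=float)[None, :]
    return np.exp(-((s - t) ** 2) / (2.0 * bandwidth ** 2))


def reward_estimate(expert_seqs, generated_seqs, query_times, bandwidth=1.0):
    """Unnormalized empirical reward at each query time.

    Assumption: the reward is the difference between the average kernel
    sum over observed (expert) sequences and the average kernel sum over
    generated sequences.
    """
    query_times = np.atleast_1d(np.asarray(query_times, dtype=float))

    def mean_embedding(seqs):
        # Average, over sequences, of the summed kernel evaluations
        # between each event in the sequence and each query time.
        total = np.zeros_like(query_times)
        for seq in seqs:
            if len(seq) > 0:
                total += gaussian_kernel(seq, query_times, bandwidth).sum(axis=0)
        return total / len(seqs)

    return mean_embedding(expert_seqs) - mean_embedding(generated_seqs)
```

Under this reading, the reward is positive at times where observed events are denser than generated ones and negative where the generator over-produces events, which is what removes the need for an inner maximization loop.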