Learning Temporal Point Processes via Reinforcement Learning
Social goods, such as healthcare, smart cities, and information networks, often produce ordered event data in continuous time. The generative processes of these event data can be very complex, requiring flexible models to capture their dynamics. Temporal point processes offer an elegant framework for modeling event data without discretizing time. However, the existing maximum-likelihood-estimation (MLE) learning paradigm requires hand-crafting the intensity function beforehand and cannot directly monitor the goodness-of-fit of the estimated model during training. To alleviate the risk of model misspecification in MLE, we propose to generate samples from the generative model and monitor the quality of the samples during training until the samples and the real data are indistinguishable. We take inspiration from reinforcement learning (RL) and treat the generation of each event as the action taken by a stochastic policy. We parameterize the policy as a flexible recurrent neural network and gradually improve the policy to mimic the observed event distribution. Since the reward function is unknown in this setting, we uncover an analytic and nonparametric form of the reward function using an inverse reinforcement learning formulation. This new RL framework allows us to derive an efficient policy gradient algorithm for learning flexible point process models, and we show that it performs well on both synthetic and real data.
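To make the MLE baseline mentioned in the abstract concrete, here is a minimal sketch of what hand-crafting an intensity function looks like. It assumes an exponential-kernel Hawkes process, which is one common parametric choice and not necessarily a model the authors compare against. The function name and the parameters `mu`, `alpha`, and `beta` are illustrative assumptions.

```python
import math


def hawkes_log_likelihood(events, horizon, mu, alpha, beta):
    """Log-likelihood of sorted event times on [0, horizon].

    Assumed (hand-crafted) intensity, not taken from the paper:
        lambda(t) = mu + alpha * sum_{t_i < t} exp(-beta * (t - t_i))
    """
    log_intensity_sum = 0.0
    # decay_sum holds sum_{j < i} exp(-beta * (t_i - t_j)) for the current event.
    decay_sum = 0.0
    previous = None
    for t in events:
        if previous is not None:
            # Recursive update: decay all earlier contributions and add the last event.
            decay_sum = (decay_sum + 1.0) * math.exp(-beta * (t - previous))
        log_intensity_sum += math.log(mu + alpha * decay_sum)
        previous = t

    # Compensator: the integral of lambda(t) over [0, horizon], in closed form.
    compensator = mu * horizon + (alpha / beta) * sum(
        1.0 - math.exp(-beta * (horizon - t)) for t in events
    )
    return log_intensity_sum - compensator


if __name__ == "__main__":
    observed = [0.5, 1.2, 1.3, 3.7]
    print(hawkes_log_likelihood(observed, horizon=5.0, mu=0.4, alpha=0.6, beta=1.5))
```

With `alpha = 0` the expression reduces to the homogeneous Poisson log-likelihood, n log(mu) - mu * horizon. If the true process does not follow the chosen parametric form, maximizing this quantity can still return a confident but misspecified fit, which is the risk the abstract points to.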
Reviews: Learning Temporal Point Processes via Reinforcement Learning
The paper "Learning Temporal Point Processes via Reinforcement Learning" proposes a new way to learn temporal point processes, where the intensity function is defined via recurrent neural networks rather than classical parametric forms. This enables a better fit to the true generative process. It builds on the WGAN approach, but rather than dealing with a minimax optimization problem, the authors propose to use an RKHS formalization to find an analytic maximum for the expected cumulative discrepancy between the processes underlying the observed sequences and the generated ones. The results look convincing and the inverse reinforcement learning approach is elegant, but I am missing some justification and clarifications w.r.t. Authors claim that their way of learning allows them to define better-specified models than parametric-based ones.
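The analytic reward the review refers to can be illustrated with a short sketch. It assumes the reward is restricted to the unit ball of an RKHS with a Gaussian kernel, in which case the maximizer of a mean discrepancy is, up to normalization, the difference between the kernel mean embeddings of observed and generated event times. The exact form used in the paper is not given in the text above, so the function names, the bandwidth, and the unnormalized form are all assumptions.

```python
import math


def gaussian_kernel(s, t, bandwidth=1.0):
    """Gaussian (RBF) kernel between two event times; the kernel choice is an assumption."""
    return math.exp(-((s - t) ** 2) / (2.0 * bandwidth ** 2))


def kernel_reward(t, expert_seqs, policy_seqs, bandwidth=1.0):
    """Unnormalized reward for an event at time t.

    Average per-sequence kernel mass from observed (expert) events minus the
    same quantity from events generated by the current policy.
    """
    expert_term = sum(
        gaussian_kernel(s, t, bandwidth) for seq in expert_seqs for s in seq
    ) / len(expert_seqs)
    policy_term = sum(
        gaussian_kernel(s, t, bandwidth) for seq in policy_seqs for s in seq
    ) / len(policy_seqs)
    return expert_term - policy_term


def sequence_return(seq, expert_seqs, policy_seqs, bandwidth=1.0):
    """Cumulative reward of one generated sequence, as a policy gradient would weight it."""
    return sum(kernel_reward(t, expert_seqs, policy_seqs, bandwidth) for t in seq)


if __name__ == "__main__":
    expert = [[0.5, 1.0, 1.5], [0.4, 1.1]]  # hypothetical observed sequences
    generated = [[3.0, 3.5], [2.8, 3.2, 3.9]]  # hypothetical policy samples
    print(kernel_reward(1.0, expert, generated))  # positive: close to observed events
    print(kernel_reward(3.2, expert, generated))  # negative: only the policy puts events here
```

Under these assumptions the reward is positive where observed events are denser than generated ones and negative in the opposite case, and it is identically zero when the two sets of sequences coincide. No inner maximization over a critic network is needed, which is the contrast with the WGAN minimax setup that the review draws.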
Learning Temporal Point Processes via Reinforcement Learning
Li, Shuang, Xiao, Shuai, Zhu, Shixiang, Du, Nan, Xie, Yao, Song, Le