### Lagged Exact Bayesian Online Changepoint Detection

Identifying changes in the generative process of sequential data, known as changepoint detection, has become an increasingly important topic for a wide variety of fields. A recently developed approach, which we call EXact Online Bayesian Changepoint Detection (EXO), has shown reasonable results with efficient computation for real time updates. However, when the changes are relatively small, EXO starts to have difficulty in detecting changepoints accurately. We propose a new algorithm called $\ell$-Lag EXact Online Bayesian Changepoint Detection (LEXO-$\ell$), which improves the accuracy of the detection by incorporating $\ell$ time lags in the inference. We prove that LEXO-1 finds the exact posterior distribution for the current run length and can be computed efficiently, with extension to arbitrary lag. Additionally, we show that LEXO-1 performs better than EXO in an extensive simulation study; this study is extended to higher order lags to illustrate the performance of the generalized methodology. Lastly, we illustrate applicability with two real world data examples comparing EXO and LEXO-1.

### Exact Bayesian inference for off-line change-point detection in tree-structured graphical models

We consider the problem of change-point detection in multivariate time-series. The multivariate distribution of the observations is supposed to follow a graphical model, whose graph and parameters are affected by abrupt changes throughout time. We demonstrate that it is possible to perform exact Bayesian inference whenever one considers a simple class of undirected graphs called spanning trees as possible structures. We are then able to integrate on the graph and segmentation spaces at the same time by combining classical dynamic programming with algebraic results pertaining to spanning trees. In particular, we show that quantities such as posterior distributions for change-points or posterior edge probabilities over time can efficiently be obtained. We illustrate our results on both synthetic and experimental data arising from biology and neuroscience.

### Stein Variational Online Changepoint Detection with Applications to Hawkes Processes and Neural Networks

Bayesian online changepoint detection (BOCPD) (Adams & MacKay, 2007) offers a rigorous and viable way to identity changepoints in complex systems. In this work, we introduce a Stein variational online changepoint detection (SVOCD) method to provide a computationally tractable generalization of BOCPD beyond the exponential family of probability distributions. We integrate the recently developed Stein variational Newton (SVN) method (Detommaso et al., 2018) and BOCPD to offer a full online Bayesian treatment for a large number of situations with significant importance in practice. We apply the resulting method to two challenging and novel applications: Hawkes processes and long short-term memory (LSTM) neural networks. In both cases, we successfully demonstrate the efficacy of our method on real data.

### Spatio-temporal Bayesian On-line Changepoint Detection with Model Selection

Bayesian On-line Changepoint Detection is extended to on-line model selection and non-stationary spatio-temporal processes. We propose spatially structured Vector Autoregressions (VARs) for modelling the process between changepoints (CPs) and give an upper bound on the approximation error of such models. The resulting algorithm performs prediction, model selection and CP detection on-line. Its time complexity is linear and its space complexity constant, and thus it is two orders of magnitudes faster than its closest competitor. In addition, it outperforms the state of the art for multivariate data.

### Doubly Robust Bayesian Inference for Non-Stationary Streaming Data with $\beta$-Divergences

We present the very first robust Bayesian Online Changepoint Detection algorithm through General Bayesian Inference (GBI) with $\beta$-divergences. The resulting inference procedure is doubly robust for both the predictive and the changepoint (CP) posterior, with linear time and constant space complexity. We provide a construction for exponential models and demonstrate it on the Bayesian Linear Regression model. In so doing, we make two additional contributions: Firstly, we make GBI scalable using Structural Variational approximations that are exact as $\beta \to 0$. Secondly, we give a principled way of choosing the divergence parameter $\beta$ by minimizing expected predictive loss on-line. Reducing False Discovery Rates of \CPs from up to 99\% to 0\% on real world data, this offers the state of the art.