Optimal Resolution of Change-Point Detection with Empirically Observed Statistics and Erasures

He, Haiyun, Zhang, Qiaosheng, Tan, Vincent Y. F.

arXiv.org Machine Learning 

This paper revisits the offline change-point detection problem from a statistical learning perspective. Instead of assuming that the underlying pre- and post-change distributions are known, it is assumed that we have partial knowledge of these distributions based on empirically observed statistics in the form of training sequences. Our problem formulation has a variety of real-life applications, from detecting when climate change occurred to detecting when a virus mutated. Using the training sequences as well as a test sequence containing a single change, and allowing for the erasure or rejection option, we derive the optimal resolution between the estimated and true change-points under two different asymptotic regimes on the undetected error probability, namely the large and moderate deviations regimes. In both regimes, strong converses are also proved. In the moderate deviations case, the optimal resolution is a simple function of a symmetrized version of the chi-square distance.

I. INTRODUCTION AND MOTIVATION

The change-point detection (CPD) problem consists in finding changes in the underlying statistical model of data sequences that are modelled as time series. This problem has a plethora of applications in industrial systems [1], medical diagnoses [2], environmental monitoring [3], speech processing [4], finance, economics, and so on [5]. CPD problems can be divided into two main types: offline CPD and online CPD [6]; the latter is also known as sequential CPD. The distinction depends on whether the data sequence is fixed or obtained in real time. Offline CPD arises, for example, in anomaly detection problems such as detecting climate change from existing, known statistics.
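To make the problem setup concrete, the following is a minimal Python sketch of offline CPD with training sequences and an erasure option. It is an illustration only and is not the estimator analyzed in the paper: the plug-in log-likelihood score, the add-one smoothing, the `margin` rejection rule, and the names `empirical_pmf` and `detect_change_point` are all assumptions introduced here.

```python
import math
from collections import Counter


def empirical_pmf(seq, alphabet_size, smoothing=1.0):
    """Smoothed empirical distribution of a training sequence.

    Symbols are assumed to be integers in range(alphabet_size).
    Add-one smoothing is an assumption made here to avoid log(0).
    """
    counts = Counter(seq)
    total = len(seq) + smoothing * alphabet_size
    return [(counts[a] + smoothing) / total for a in range(alphabet_size)]


def detect_change_point(test, train_pre, train_post, alphabet_size, margin):
    """Estimate the change-point of `test`, or return None (erasure).

    A candidate c means that test[:c] is pre-change and test[c:] is
    post-change. Each candidate is scored by the log-likelihood of the
    test sequence under the empirical training distributions.
    The estimate is rejected when the best score does not beat the
    runner-up by at least `margin` (a hypothetical rejection rule).
    """
    p_pre = empirical_pmf(train_pre, alphabet_size)
    p_post = empirical_pmf(train_post, alphabet_size)
    n = len(test)
    scores = {}
    for c in range(1, n):  # requires len(test) >= 3
        pre = sum(math.log(p_pre[x]) for x in test[:c])
        post = sum(math.log(p_post[x]) for x in test[c:])
        scores[c] = pre + post
    ranked = sorted(scores, key=scores.get, reverse=True)
    best, runner_up = ranked[0], ranked[1]
    if scores[best] - scores[runner_up] < margin:
        return None  # declare an erasure instead of risking an error
    return best


# Toy binary example: the pre-change source favours 0, the post-change source favours 1.
train_pre = [0] * 9 + [1]
train_post = [1] * 9 + [0]
test = [0] * 5 + [1] * 5  # true change after the fifth sample

estimate = detect_change_point(test, train_pre, train_post, 2, margin=1.0)
erased = detect_change_point(test, train_pre, train_post, 2, margin=2.0)
```

Raising `margin` trades a lower chance of an undetected error for a higher chance of erasure, which is the kind of trade-off that the erasure option in the abstract refers to.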
