Goto

Collaborating Authors

 scan b-statistic


Reproduction of scan B-statistic for kernel change-point detection algorithm

arXiv.org Machine Learning

Change-point detection has garnered significant attention due to its broad range of applications, including epidemic disease outbreaks, social network evolution, image analysis, and wireless communications. In an online setting, where new data samples arrive sequentially, it is crucial to continuously test whether these samples originate from a different distribution. Ideally, the detection algorithm should be distribution-free to ensure robustness in real-world applications. In this paper, we reproduce a recently proposed online change-point detection algorithm based on an efficient kernel-based scan B-statistic, and compare its performance with two commonly used parametric statistics. Our numerical experiments demonstrate that the scan B-statistic consistently delivers superior performance. In more challenging scenarios, parametric methods may fail to detect changes, whereas the scan B-statistic successfully identifies them in a timely manner. Additionally, the use of subsampling techniques offers a modest improvement to the original algorithm.


Online Kernel CUSUM for Change-Point Detection

arXiv.org Machine Learning

We present a computationally efficient online kernel Cumulative Sum (CUSUM) method for change-point detection that utilizes the maximum over a set of kernel statistics to account for the unknown change-point location. Our approach exhibits increased sensitivity to small changes compared to existing kernel-based change-point detection methods, including Scan-B statistic, corresponding to a non-parametric Shewhart chart-type procedure. We provide accurate analytic approximations for two key performance metrics: the Average Run Length (ARL) and Expected Detection Delay (EDD), which enable us to establish an optimal window length to be on the order of the logarithm of ARL to ensure minimal power loss relative to an oracle procedure with infinite memory. Moreover, we introduce a recursive calculation procedure for detection statistics to ensure constant computational and memory complexity, which is essential for online implementation. Through extensive experiments on both simulated and real data, we demonstrate the competitive performance of our method and validate our theoretical results.


Scan $B$-Statistic for Kernel Change-Point Detection

arXiv.org Machine Learning

Detecting the emergence of an abrupt change-point is a classic problem in statistics and machine learning. Kernel-based nonparametric statistics have been used for this task which enjoy fewer assumptions on the distributions than the parametric approach and can handle high-dimensional data. In this paper we focus on the scenario when the amount of background data is large, and propose two related computationally efficient kernel-based statistics for change-point detection, which are inspired by the recently developed $B$-statistics. A novel theoretical result of the paper is the characterization of the tail probability of these statistics using the change-of-measure technique, which focuses on characterizing the tail of the detection statistics rather than obtaining its asymptotic distribution under the null distribution. Such approximations are crucial to control the false alarm rate, which corresponds to the significance level in offline change-point detection and the average-run-length in online change-point detection. Our approximations are shown to be highly accurate. Thus, they provide a convenient way to find detection thresholds for both offline and online cases without the need to resort to the more expensive simulations or bootstrapping. We show that our methods perform well on both synthetic data and real data.