Sequential Adaptive Design for Jump Regression Estimation

Park, Chiwoo; Qiu, Peihua

arXiv.org Machine Learning 

Selecting input data or design points for statistical models has long been of interest in sequential design and active learning. In this paper, we present a new strategy for selecting the design points of a regression model when the underlying regression function is discontinuous. Two motivating examples are (1) compressed material imaging, where the goal is to accelerate imaging, and (2) design for regression analysis over a phase diagram in chemistry. In both examples, the underlying regression functions have discontinuities, so many existing design optimization approaches cannot be applied because they mostly assume a continuous regression function. Some studies address the estimation of a discontinuous regression function from noisy observations, but they typically assume that all observations are provided in advance. We first review existing approaches to design optimization and active learning for regression analysis and discuss their limitations in handling a discontinuous regression function. We then present our design strategy for regression analysis with discontinuities: we first establish some statistical properties under a fixed design and then use these properties to propose a new criterion for selecting the design points. Sequential design of experiments with the new criterion is demonstrated with numerical examples.
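
To make the setting concrete, the following is a minimal sketch of a sequential design loop on a one-dimensional regression function with a single jump. It is not the criterion proposed in the paper: the selection rule used here (place the next point midway between the two adjacent design points whose noisy responses differ most) is a simple stand-in heuristic, and the names `jump_function`, `sequential_design`, and `estimate_jump`, the jump location 0.37, and the noise level are all hypothetical choices made for illustration.

```python
import random


def jump_function(x, jump_at=0.37):
    """Hypothetical test function: a gentle linear trend plus a unit step at jump_at."""
    return 0.2 * x + (1.0 if x >= jump_at else 0.0)


def sequential_design(f, n_init=5, n_add=20, noise_sd=0.02, seed=0):
    """Grow a design on [0, 1] one point at a time.

    Stand-in criterion (an assumption, not the paper's): the next point is the
    midpoint of the adjacent pair with the largest absolute response difference.
    """
    rng = random.Random(seed)
    # Start from an equally spaced initial design with noisy observations.
    xs = [i / (n_init - 1) for i in range(n_init)]
    ys = [f(x) + rng.gauss(0.0, noise_sd) for x in xs]
    for _ in range(n_add):
        pairs = sorted(zip(xs, ys))
        # Index of the adjacent pair whose responses differ most.
        best = max(
            range(len(pairs) - 1),
            key=lambda i: abs(pairs[i + 1][1] - pairs[i][1]),
        )
        x_new = 0.5 * (pairs[best][0] + pairs[best + 1][0])
        xs.append(x_new)
        ys.append(f(x_new) + rng.gauss(0.0, noise_sd))
    return sorted(zip(xs, ys))


def estimate_jump(design):
    """Estimate the jump location as the midpoint of the steepest adjacent pair."""
    best = max(
        range(len(design) - 1),
        key=lambda i: abs(design[i + 1][1] - design[i][1]),
    )
    return 0.5 * (design[best][0] + design[best + 1][0])


if __name__ == "__main__":
    design = sequential_design(jump_function)
    print(len(design), estimate_jump(design))
```

Under these assumptions the design points concentrate around the discontinuity, which is the behavior a design that presumes a continuous regression function would not produce. The heuristic depends on the jump being large relative to the noise and to the smooth trend; it is meant only to illustrate the sequential setting.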