In point-based sensing systems such as coordinate measuring machines (CMM) and laser ultrasonics where complete sensing is impractical due to the high sensing time and cost, adaptive sensing through a systematic exploration is vital for online inspection and anomaly quantification. Most of the existing sequential sampling methodologies focus on reducing the overall fitting error for the entire sampling space. However, in many anomaly quantification applications, the main goal is to estimate sparse anomalous regions in the pixel-level accurately. In this paper, we develop a novel framework named Adaptive Kernelized Maximum-Minimum Distance AKM$^2$D to speed up the inspection and anomaly detection process through an intelligent sequential sampling scheme integrated with fast estimation and detection. The proposed method balances the sampling efforts between the space-filling sampling (exploration) and focused sampling near the anomalous region (exploitation). The proposed methodology is validated by conducting simulations and a case study of anomaly detection in composite sheets using a guided wave test.
This paper presents a rigorous statistical analysis characterizing regimes in which active learning significantly outperforms classical passive learning. Activelearning algorithms are able to make queries or select sample locations in an online fashion, depending on the results of the previous queries. In some regimes, this extra flexibility leads to significantly faster rates of error decay than those possible in classical passive learning settings. Thenature of these regimes is explored by studying fundamental performance limits of active and passive learning in two illustrative nonparametric function classes. In addition to examining the theoretical potentialof active learning, this paper describes a practical algorithm capable of exploiting the extra flexibility of the active setting and provably improvingupon the classical passive techniques. Our active learning theory and methods show promise in a number of applications, including field estimation using wireless sensor networks and fault line detection.
A common challenge in machine learning and related fields is the need to efficiently explore high dimensional parameter spaces using small numbers of samples. Typical examples are hyper-parameter optimization in deep learning and sample mining in predictive modeling tasks. All such problems trade-off exploration, which samples the space without knowledge of the target function, and exploitation where information from previous evaluations is used in an adaptive feedback loop. Much of the recent focus has been on the exploitation while exploration is done with simple designs such as Latin hypercube or even uniform random sampling. In this paper, we introduce optimal space-filling sample designs for effective exploration of high dimensional spaces. Specifically, we propose a new parameterized family of sample designs called space-filling spectral designs, and introduce a framework to choose optimal designs for a given sample size and dimension. Furthermore, we present an efficient algorithm to synthesize a given spectral design. Finally, we evaluate the performance of spectral designs in both data space and model space applications. The data space exploration is targeted at recovering complex regression functions in high dimensional spaces. The model space exploration focuses on selecting hyper-parameters for a given neural network architecture. Our empirical studies demonstrate that the proposed approach consistently outperforms state-of-the-art techniques, particularly with smaller design sizes.
Active Learning (AL) is increasingly important in a broad range of applications. Two main AL principles to obtain accurate classification with few labeled data are refinement of the current decision boundary and exploration of poorly sampled regions. In this paper we derive a novel AL scheme that balances these two principles in a natural way. In contrast to many AL strategies, which are based on an estimated class conditional probability ^p(y|x), a key component of our approach is to view this quantity as a random variable, hence explicitly considering the uncertainty in its estimated value. Our main contribution is a novel mathematical framework for uncertainty-based AL, and a corresponding AL scheme, where the uncertainty in ^p(y|x) is modeled by a second-order distribution. On the practical side, we show how to approximate such second-order distributions for kernel density classification. Finally, we find that over a large number of UCI, USPS and Caltech4 datasets, our AL scheme achieves significantly better learning curves than popular AL methods such as uncertainty sampling and error reduction sampling, when all use the same kernel density classifier.
Kriging is an efficient machine-learning tool, which allows to obtain an approximate response of an investigated phenomenon on the whole parametric space. Adaptive schemes provide a the ability to guide the experiment yielding new sample point positions to enrich the metamodel. Herein a novel adaptive scheme called Monte Carlo-intersite Voronoi (MiVor) is proposed to efficiently identify binary decision regions on the basis of a regression surrogate model. The performance of the innovative approach is tested for analytical functions as well as some mechanical problems and is furthermore compared to two regression-based adaptive schemes. For smooth problems, all three methods have comparable performances. For highly fluctuating response surface as encountered e.g. for dynamics or damage problems, the innovative MiVor algorithm performs very well and provides accurate binary classification with only a few observation points.