Faster Rates in Regression via Active Learning

Neural Information Processing Systems

This paper presents a rigorous statistical analysis characterizing regimes in which active learning significantly outperforms classical passive learning. Active learning algorithms are able to make queries or select sample locations in an online fashion, depending on the results of previous queries. In some regimes, this extra flexibility leads to significantly faster rates of error decay than those possible in classical passive learning settings. The nature of these regimes is explored by studying fundamental performance limits of active and passive learning in two illustrative nonparametric function classes. In addition to examining the theoretical potential of active learning, this paper describes a practical algorithm capable of exploiting the extra flexibility of the active setting and provably improving upon classical passive techniques. Our active learning theory and methods show promise in a number of applications, including field estimation using wireless sensor networks and fault line detection.
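A minimal sketch of the generic active-learning loop the abstract describes, where each new sample location depends on the results of previous queries. The target function f and the variance-based selection rule are illustrative assumptions, not the paper's specific algorithm or rates.

```python
# Generic active-learning loop: refit a surrogate, then query where it
# is least certain. f and the selection rule are illustrative only.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def f(x):  # hypothetical unknown function the learner may query
    return np.sin(6 * x) + 0.1 * np.random.randn(*x.shape)

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(5, 1))         # small initial passive sample
y = f(X).ravel()
candidates = np.linspace(0, 1, 200).reshape(-1, 1)

gp = GaussianProcessRegressor()
for _ in range(20):                        # active querying rounds
    gp.fit(X, y)
    _, std = gp.predict(candidates, return_std=True)
    x_next = candidates[[np.argmax(std)]]  # query where the model is least certain
    X = np.vstack([X, x_next])
    y = np.append(y, f(x_next).ravel())
```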


Controlled Random Search Improves Sample Mining and Hyper-Parameter Optimization

arXiv.org Machine Learning

A common challenge in machine learning and related fields is the need to efficiently explore high dimensional parameter spaces using small numbers of samples. Typical examples are hyper-parameter optimization in deep learning and sample mining in predictive modeling tasks. All such problems trade off exploration, which samples the space without knowledge of the target function, against exploitation, where information from previous evaluations is used in an adaptive feedback loop. Much of the recent focus has been on exploitation, while exploration is done with simple designs such as Latin hypercube or even uniform random sampling. In this paper, we introduce optimal space-filling sample designs for effective exploration of high dimensional spaces. Specifically, we propose a new parameterized family of sample designs called space-filling spectral designs, and introduce a framework to choose optimal designs for a given sample size and dimension. Furthermore, we present an efficient algorithm to synthesize a given spectral design. Finally, we evaluate the performance of spectral designs in both data space and model space applications. The data space exploration is targeted at recovering complex regression functions in high dimensional spaces. The model space exploration focuses on selecting hyper-parameters for a given neural network architecture. Our empirical studies demonstrate that the proposed approach consistently outperforms state-of-the-art techniques, particularly with smaller design sizes.
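For context, a short sketch of the Latin-hypercube baseline the abstract contrasts against, using scipy's quasi-Monte Carlo module; the paper's space-filling spectral designs are its own contribution and are not reproduced here. Discrepancy is one standard measure of how evenly a design fills the unit cube.

```python
# Latin-hypercube baseline design vs. uniform random sampling; lower
# discrepancy indicates a more evenly space-filling design.
import numpy as np
from scipy.stats import qmc

d, n = 10, 64                              # dimension and design size
lhs = qmc.LatinHypercube(d=d, seed=0).random(n)
uniform = np.random.default_rng(0).random((n, d))

print("LHS discrepancy:    ", qmc.discrepancy(lhs))
print("Uniform discrepancy:", qmc.discrepancy(uniform))
```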


An innovative adaptive kriging approach for efficient binary classification of mechanical problems

arXiv.org Machine Learning

Kriging is an efficient machine-learning tool that yields an approximate response of an investigated phenomenon over the whole parametric space. Adaptive schemes provide the ability to guide the experiment, yielding new sample-point positions that enrich the metamodel. Herein, a novel adaptive scheme called Monte Carlo-intersite Voronoi (MiVor) is proposed to efficiently identify binary decision regions on the basis of a regression surrogate model. The performance of the approach is tested on analytical functions as well as some mechanical problems, and is furthermore compared to two regression-based adaptive schemes. For smooth problems, all three methods have comparable performance. For highly fluctuating response surfaces, as encountered e.g. in dynamics or damage problems, the MiVor algorithm performs very well and provides accurate binary classification with only a few observation points.
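A hedged sketch of adaptive kriging for binary classification via a regression surrogate: classify by the sign of a scalar response g(x), and enrich the design where the surrogate is close to the decision threshold and uncertain. The selection rule below is a generic U-function-style criterion, not the MiVor scheme itself; g is a stand-in for a mechanical model.

```python
# Adaptive kriging for binary classification: class = sign(g(x)); add
# points where |mean|/std is small, i.e. likely misclassification.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def g(x):  # hypothetical scalar response; class = sign(g)
    return np.sin(5 * x[:, 0]) - 0.3

rng = np.random.default_rng(1)
X = rng.uniform(0, 1, size=(6, 1))
y = g(X)
pool = np.linspace(0, 1, 500).reshape(-1, 1)

gp = GaussianProcessRegressor()
for _ in range(15):
    gp.fit(X, y)
    mu, std = gp.predict(pool, return_std=True)
    score = np.abs(mu) / np.maximum(std, 1e-12)
    x_next = pool[[np.argmin(score)]]      # enrich near the uncertain boundary
    X = np.vstack([X, x_next])
    y = np.append(y, g(x_next))

labels = gp.predict(pool) > 0              # final binary decision regions
```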


Active Learning with Distributional Estimates

arXiv.org Machine Learning

Active Learning (AL) is increasingly important in a broad range of applications. Two main AL principles to obtain accurate classification with few labeled data are refinement of the current decision boundary and exploration of poorly sampled regions. In this paper we derive a novel AL scheme that balances these two principles in a natural way. In contrast to many AL strategies, which are based on an estimated class-conditional probability p̂(y|x), a key component of our approach is to view this quantity as a random variable, hence explicitly considering the uncertainty in its estimated value. Our main contribution is a novel mathematical framework for uncertainty-based AL, and a corresponding AL scheme, where the uncertainty in p̂(y|x) is modeled by a second-order distribution. On the practical side, we show how to approximate such second-order distributions for kernel density classification. Finally, we find that over a large number of UCI, USPS and Caltech4 datasets, our AL scheme achieves significantly better learning curves than popular AL methods such as uncertainty sampling and error reduction sampling, when all use the same kernel density classifier.
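An illustrative sketch of treating p̂(y|x) as a random variable: kernel-weighted class counts define a Beta second-order distribution at each candidate, and the learner queries where that distribution is widest. The Beta construction and the Gaussian kernel are assumptions made here for illustration; they are not the paper's exact second-order model.

```python
# Second-order uncertainty over p(y=1|x): Beta posterior from kernel-
# weighted pseudo-counts. Querying where its std is largest balances
# boundary refinement (ambiguous neighbors) and exploration (few neighbors).
import numpy as np
from scipy.stats import beta

def kernel_counts(x, X, y, h=0.2):
    w = np.exp(-0.5 * ((X - x) / h) ** 2).ravel()   # Gaussian kernel weights
    return w[y == 1].sum(), w[y == 0].sum()         # pseudo-counts per class

rng = np.random.default_rng(2)
X = rng.uniform(0, 1, size=(10, 1))
y = (X.ravel() > 0.5).astype(int)
candidates = np.linspace(0, 1, 100)

scores = []
for x in candidates:
    a, b = kernel_counts(x, X, y)
    scores.append(beta(a + 1, b + 1).std())         # width of the second-order Beta
x_next = candidates[np.argmax(scores)]
```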


Function Estimation via Reconstruction

arXiv.org Machine Learning

This paper introduces an interpolation-based method, called the reconstruction approach, for function estimation in nonparametric models. Based on the fact that interpolation usually has negligible error compared to statistical estimation, the reconstruction approach uses an interpolator to parameterize the unknown function by its values at finitely many knots, and then estimates these values by minimizing a regularized empirical risk function. Some popular methods, including kernel ridge regression and kernel support vector machines, can be viewed as special cases of it. It is shown that the reconstruction idea not only provides a different angle from which to view existing methods, but also produces new, effective experimental-design and estimation methods for nonparametric models.
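A minimal sketch of the reconstruction idea as described in the abstract: represent the unknown function by its values theta at a few knots, map knot values to predictions through an interpolator, and fit theta by ridge-regularized least squares. The Gaussian RBF interpolator, knot placement, and regularization weight lam are illustrative choices, not the paper's prescriptions.

```python
# Reconstruction approach sketch: f(x) = k(x, T) @ inv(k(T, T)) @ theta
# interpolates the knot values theta exactly; theta is then estimated by
# minimizing a ridge-regularized empirical risk.
import numpy as np

def k(a, b, h=0.3):                        # Gaussian RBF kernel matrix
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / h) ** 2)

rng = np.random.default_rng(3)
X = rng.uniform(0, 1, 60)                  # noisy observations of f
y = np.sin(4 * X) + 0.1 * rng.standard_normal(60)
T = np.linspace(0, 1, 12)                  # knots parameterizing f

KTT_inv = np.linalg.inv(k(T, T) + 1e-8 * np.eye(len(T)))
Phi = k(X, T) @ KTT_inv                    # maps knot values to predictions at X

lam = 1e-2                                 # regularized empirical risk weight
theta = np.linalg.solve(Phi.T @ Phi + lam * np.eye(len(T)), Phi.T @ y)

x_grid = np.linspace(0, 1, 200)
f_hat = k(x_grid, T) @ KTT_inv @ theta     # reconstructed function estimate
```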