 empirical approximation



Response to Reviewer 1: 3

Neural Information Processing Systems

We thank all reviewers for their comments and acknowledgement of our contribution. Below we address each reviewer's comments separately. The reviewer raised a very good point. We will add this clarification in the revised version. Our gradient-based method is much more efficient but only finds a stationary point.


Consistency and Regression with Laplacian regularization in Reproducing Kernel Hilbert Space

arXiv.org Machine Learning

This note explains a way to look at reproducing kernel Hilbert spaces (RKHS) for regression problems. It consists in expressing kernel regression solutions with simple integral operator algebra, which we can approximate consistently from empirical data, providing the corresponding estimators of the solutions. Let's consider the classical regression problem $\arg\min_{f} \|f(x) - y\|$. In practice we are going to restrict the search for a solution $f \in \mathcal{F}$ to a simpler function space, $f \in \mathcal{H}$. Let's associate to it the canonical RKHS $\mathcal{H}$, see Aronszajn (1950). It is good to find a function $f$ from $X$ to $\mathbb{R}$, but what if $Y$ is a real Hilbert space? Indeed, it is natural to extend the theory of RKHS to vector-valued functions, see Schwartz (1964). Once again we can build a Hilbert space of functions from $X$ to $Y$; let's first define $\gamma$. Those are going to be the building elements of $\mathcal{H}$. Definition 1 (The RKHS $\mathcal{H}$).
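As a rough illustration of an empirical estimator of a kernel regression solution, the following is a minimal sketch and not the note's own construction. It assumes a Gaussian kernel, a plain ridge (Tikhonov) penalty rather than the Laplacian regularization named in the title, and the simplest vector-valued setting in which a scalar kernel acts identically on each output coordinate. The function names, the bandwidth, the value of `lam`, and the synthetic data are all hypothetical.

```python
import numpy as np

def gaussian_kernel(A, B, bandwidth=1.0):
    # Pairwise squared distances between rows of A and rows of B.
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * bandwidth**2))

def fit_kernel_ridge(X, Y, lam=1e-3, bandwidth=1.0):
    # Solves (K + n * lam * I) alpha = Y, the empirical counterpart of a
    # regularized operator equation. Y may have several columns
    # (vector-valued outputs); each column is solved with the same kernel.
    n = X.shape[0]
    K = gaussian_kernel(X, X, bandwidth)
    return np.linalg.solve(K + n * lam * np.eye(n), Y)

def predict(X_train, alpha, X_new, bandwidth=1.0):
    # f(x) = sum_i k(x, x_i) alpha_i
    return gaussian_kernel(X_new, X_train, bandwidth) @ alpha

# Synthetic data (assumption): inputs in [-1, 1], outputs in R^2.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(50, 1))
Y = np.hstack([np.sin(3 * X), np.cos(3 * X)])

alpha = fit_kernel_ridge(X, Y, lam=1e-6, bandwidth=0.5)
Y_hat = predict(X, alpha, X, bandwidth=0.5)
train_mse = float(np.mean((Y_hat - Y) ** 2))
```

Treating each output coordinate with the same scalar kernel is only one possible operator-valued kernel; richer choices couple the coordinates.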


Adaptivity for Regularized Kernel Methods by Lepskii's Principle

arXiv.org Machine Learning

We address the problem of {\it adaptivity} in the framework of reproducing kernel Hilbert space (RKHS) regression. More precisely, we analyze estimators arising from a linear regularization scheme $g_\lambda$. In practical applications, an important task is to choose the regularization parameter $\lambda$ appropriately, i.e. based only on the given data and independently of unknown structural assumptions on the regression function. An attractive approach avoiding data-splitting is the {\it Lepskii Principle} (LP), also known as the {\it Balancing Principle} in this setting. We show that a modified parameter choice based on (LP) is minimax optimal adaptive, up to $\log\log(n)$. A convenient result is the fact that balancing in the $L^2(\nu)$-norm, which is easiest, automatically gives optimal balancing in all stronger norms, interpolating between $L^2(\nu)$ and the RKHS. An analogous result is open for other classical approaches to data-dependent choices of the regularization parameter, e.g. for Hold-Out.
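The basic balancing idea can be sketched as follows. This is the textbook form of the principle, not the modified parameter choice analyzed in the paper. It assumes a grid of regularization parameters sorted in increasing order, fitted values for each grid point, and a user-supplied bound on the stochastic error for each grid point; the empirical $L^2$ norm stands in for the $L^2(\nu)$-norm. The function name `lepskii_choice`, the constant `C`, and the toy numbers are hypothetical.

```python
import numpy as np

def lepskii_choice(estimates, noise_bounds, C=4.0):
    # estimates[j]: fitted values at the sample points for the j-th
    #   regularization parameter (grid sorted in increasing order).
    # noise_bounds[j]: assumed bound on the stochastic error at grid point j
    #   (typically nonincreasing in j).
    # Returns the largest index j whose estimate stays within
    # C * noise_bounds[i] of every less-regularized estimate i < j.
    chosen = 0
    for j in range(len(estimates)):
        balanced = all(
            np.sqrt(np.mean((estimates[j] - estimates[i]) ** 2)) <= C * noise_bounds[i]
            for i in range(j)
        )
        if balanced:
            chosen = j
    return chosen

# Toy illustration (made-up numbers): the fourth estimate drifts far from
# the others, so the rule stops at the third grid point.
estimates = [np.full(5, b) for b in (0.0, 0.1, 0.2, 5.0)]
noise_bounds = [1.0, 0.5, 0.25, 0.1]
index = lepskii_choice(estimates, noise_bounds, C=1.0)
```

No held-out data is needed: the rule compares estimators with each other, which is why the abstract describes it as avoiding data-splitting. The quality of the choice depends on how well `noise_bounds` tracks the true stochastic error.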