
Collaborating Authors: Christmann, Andreas


Qualitative Robustness of Support Vector Machines

arXiv.org Machine Learning

Support vector machines have attracted much attention in theoretical and applied statistics. Main topics of recent interest are consistency, learning rates, and robustness. In this article, it is shown that support vector machines are qualitatively robust. Since support vector machines can be represented by a functional on the set of all probability measures, qualitative robustness is proven by showing that this functional is continuous with respect to the topology generated by weak convergence of probability measures. Combined with the existence and uniqueness of support vector machines, our results show that support vector machines are the solutions of a well-posed mathematical problem in Hadamard's sense.
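
Read concretely, the "functional on the set of all probability measures" mentioned above is the map sending a distribution $P$ to the SVM decision function. A minimal sketch of that map, assuming the standard regularized-risk formulation (the loss $L$, the RKHS $H$, and the regularization parameter $\lambda$ are generic placeholders, not notation taken from the paper):

    f_{P,\lambda} \;=\; \operatorname*{arg\,min}_{f \in H} \; \mathbb{E}_{(X,Y)\sim P}\, L\bigl(Y, f(X)\bigr) \;+\; \lambda \, \|f\|_H^2

Under this reading, qualitative robustness follows once $P \mapsto f_{P,\lambda}$ is continuous when the set of probability measures carries the topology of weak convergence, which is exactly the continuity property stated in the abstract.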


Universal Kernels on Non-Standard Input Spaces

Neural Information Processing Systems

In recent years, support vector machines (SVMs) have been successfully applied even in situations where the input space $X$ is not necessarily a subset of $R^d$. Examples include SVMs using probability measures to analyse, e.g., histograms or coloured images, SVMs for text classification and web mining, and SVMs for applications from computational biology using, e.g., kernels for trees and graphs. Moreover, SVMs are known to be consistent, i.e., their risk converges to the Bayes risk, if either the input space is a complete separable metric space and the reproducing kernel Hilbert space (RKHS) $H \subset L_p(P_X)$ is dense, or if the SVM is based on a universal kernel $k$. So far, however, no RKHSs of practical interest are known that satisfy these assumptions on $H$ or $k$ if $X \not\subset R^d$. We close this gap by providing a general technique based on Taylor-type kernels to explicitly construct universal kernels on compact metric spaces which are not subsets of $R^d$. We apply this technique to the following special cases: universal kernels on the set of probability measures, universal kernels based on Fourier transforms, and universal kernels for signal processing.
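
To illustrate the kind of input space meant here, the following Python sketch trains an SVM on normalized histograms (discrete probability measures) via a precomputed kernel. The exponential-of-inner-product kernel below is only an illustrative Taylor-series-style construction, not the paper's exact one, and all parameter values are assumptions:

    import numpy as np
    from sklearn.svm import SVC

    rng = np.random.default_rng(0)

    def random_histogram(concentration):
        # A normalized histogram, i.e. a discrete probability measure on 10 bins.
        return rng.dirichlet(concentration * np.ones(10))

    # Two classes of histograms drawn with different Dirichlet concentrations.
    X = np.array([random_histogram(c) for c in [0.5] * 50 + [5.0] * 50])
    y = np.array([0] * 50 + [1] * 50)

    def taylor_type_kernel(A, B, gamma=1.0):
        # Exponential of the inner product of histograms; positive definite because the
        # exponential expands into a Taylor series with nonnegative coefficients.
        return np.exp(gamma * A @ B.T)

    K = taylor_type_kernel(X, X)
    clf = SVC(kernel="precomputed").fit(K, y)
    print("training accuracy:", clf.score(K, y))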


Sparsity of SVMs that use the epsilon-insensitive loss

Neural Information Processing Systems

In this paper, lower and upper bounds on the number of support vectors are derived for support vector machines (SVMs) based on the epsilon-insensitive loss function. It turns out that these bounds are asymptotically tight under mild assumptions on the data-generating distribution. Finally, we briefly discuss a trade-off in epsilon between sparsity and accuracy if the SVM is used to estimate the conditional median.
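
The sparsity/accuracy trade-off can be seen empirically. A hedged Python sketch, using scikit-learn's SVR rather than the exact estimator analysed in the paper, with an assumed data-generating model and illustrative parameter values:

    import numpy as np
    from sklearn.svm import SVR

    rng = np.random.default_rng(0)
    X = rng.uniform(-3, 3, size=(500, 1))
    y = np.sin(X[:, 0]) + rng.normal(scale=0.2, size=500)

    for eps in [0.01, 0.1, 0.3, 0.6]:
        svr = SVR(kernel="rbf", C=1.0, epsilon=eps).fit(X, y)
        frac_sv = len(svr.support_) / len(X)
        mse = np.mean((svr.predict(X) - y) ** 2)
        # Larger epsilon -> fewer support vectors (more sparsity), but a cruder fit.
        print(f"epsilon={eps:.2f}  support vectors: {frac_sv:.2%}  training MSE: {mse:.3f}")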


Fast Learning from Non-i.i.d. Observations

Neural Information Processing Systems

We prove an oracle inequality for generic regularized empirical risk minimization algorithms learning from $\alpha$-mixing processes. To illustrate this oracle inequality, we use it to derive learning rates for some learning methods including least squares SVMs. Since the proof of the oracle inequality uses recent localization ideas developed for independent and identically distributed (i.i.d.) processes, it turns out that these learning rates are close to the optimal rates known in the i.i.d. case.
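
To make the non-i.i.d. setting concrete, the following Python sketch generates an AR(1) sequence (a classical example of a geometrically $\alpha$-mixing process) and fits a least squares SVM, here approximated by kernel ridge regression; the target function, kernel, and all constants are illustrative assumptions rather than the paper's setup:

    import numpy as np
    from sklearn.kernel_ridge import KernelRidge

    rng = np.random.default_rng(0)

    def ar1(n, phi=0.7):
        # AR(1) with |phi| < 1 and Gaussian noise: alpha-mixing, hence non-i.i.d. data.
        x = np.zeros(n)
        for t in range(1, n):
            x[t] = phi * x[t - 1] + rng.normal()
        return x

    for n in [200, 800, 3200]:
        x = ar1(n)
        X, y = x[:-1, None], np.sin(x[1:])  # regress a function of the next state
        model = KernelRidge(kernel="rbf", alpha=1.0, gamma=0.5).fit(X, y)
        x_test = ar1(1000)
        X_test, y_test = x_test[:-1, None], np.sin(x_test[1:])
        err = np.mean((model.predict(X_test) - y_test) ** 2)
        # Test error should shrink as the sample size grows, echoing the learning rates.
        print(f"n={n:5d}  test MSE: {err:.4f}")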