Müller, Klaus-Robert
Unmixing Hyperspectral Data
Parra, Lucas C., Spence, Clay, Sajda, Paul, Ziehe, Andreas, Müller, Klaus-Robert
In hyperspectral imagery one pixel typically consists of a mixture of the reflectance spectra of several materials, where the mixture coefficients correspond to the abundances of the constituting materials. We assume linear combinations of reflectance spectra with some additive normal sensor noise and derive a probabilistic MAP framework for analyzing hyperspectral data. As the material reflectance characteristics are not known a priori, we face the problem of unsupervised linear unmixing.
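A minimal numpy sketch of the linear mixing model this abstract describes. The endmember matrix is assumed known here, unlike the paper's unsupervised setting, and all values are toy choices rather than the authors' algorithm.

import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(0)
n_bands, n_materials = 50, 3
M = rng.random((n_bands, n_materials))                 # columns: endmember reflectance spectra (assumed known)
a_true = np.array([0.6, 0.3, 0.1])                     # abundances: nonnegative, sum to one
x = M @ a_true + 0.01 * rng.standard_normal(n_bands)   # observed pixel = linear mixture + Gaussian sensor noise

# Under Gaussian noise, the MAP/ML abundance estimate reduces to (nonnegative) least squares.
a_hat, _ = nnls(M, x)
a_hat /= a_hat.sum()                                   # enforce the sum-to-one abundance constraint
print(np.round(a_hat, 3))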
Invariant Feature Extraction and Classification in Kernel Spaces
Mika, Sebastian, Rätsch, Gunnar, Weston, Jason, Schölkopf, Bernhard, Smola, Alex J., Müller, Klaus-Robert
v-Arc: Ensemble Learning in the Presence of Outliers
Rätsch, Gunnar, Schölkopf, Bernhard, Smola, Alex J., Müller, Klaus-Robert, Onoda, Takashi, Mika, Sebastian
The idea of a large minimum margin [17] explains the good generalization performance of AdaBoost in the low noise regime. However, AdaBoost performs worse on noisy tasks [10, 11], such as the iris and the breast cancer benchmark data sets [1]. On the latter tasks, a large margin on all training points cannot be achieved without adverse effects on the generalization error. This experimental observation was supported by the study of [13], where the generalization error of ensemble methods was bounded by the sum of the fraction of training points which have a margin smaller than some value ρ, say, plus a complexity term depending on the base hypotheses and ρ. While this bound can only capture part of what is going on in practice, it nevertheless already conveys the message that in some cases it pays to allow for some points which have a small margin, or are misclassified, if this leads to a larger overall margin on the remaining points. To cope with this problem, it was mandatory to construct regularized variants of AdaBoost which traded off the number of margin errors and the size of the margin.
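A toy numpy sketch of the margin quantity this passage refers to: for an ensemble f(x) = sum_t alpha_t h_t(x), the normalized margin of example i is y_i f(x_i) / sum_t |alpha_t|, and the bound counts the fraction of training points with margin below a threshold ρ. The matrices and weights below are hypothetical toy values, not results from the paper.

import numpy as np

H = np.array([[+1, -1, +1],        # rows: training examples, columns: base hypothesis outputs h_t(x_i)
              [+1, +1, +1],
              [-1, +1, -1],
              [-1, -1, +1]])
y = np.array([+1, +1, -1, -1])     # labels
alpha = np.array([0.5, 0.2, 0.3])  # hypothesis weights (toy values)

margins = y * (H @ alpha) / np.abs(alpha).sum()   # normalized ensemble margins
rho = 0.3
fraction_below_rho = np.mean(margins < rho)       # fraction of margin errors entering the bound
print(margins, fraction_below_rho)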
Kernel PCA and De-Noising in Feature Spaces
Mika, Sebastian, Schölkopf, Bernhard, Smola, Alex J., Müller, Klaus-Robert, Scholz, Matthias, Rätsch, Gunnar
Kernel PCA as a nonlinear feature extractor has proven powerful as a preprocessing step for classification algorithms. But it can also be considered as a natural generalization of linear principal component analysis. This gives rise to the question of how to use nonlinear features for data compression, reconstruction, and de-noising, applications common in linear PCA. This is a nontrivial task, as the results provided by kernel PCA live in some high dimensional feature space and need not have pre-images in input space. This work presents ideas for finding approximate pre-images, focusing on Gaussian kernels, and shows experimental results using these pre-images in data reconstruction and de-noising on toy examples as well as on real world data.
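An illustrative de-noising sketch in the spirit of this abstract, using scikit-learn's KernelPCA on toy data. Note that scikit-learn's inverse_transform learns a pre-image map by kernel ridge regression, a different approximate pre-image technique than the fixed-point iteration proposed in this paper; the sketch only illustrates the reconstruction/de-noising use case.

import numpy as np
from sklearn.decomposition import KernelPCA

rng = np.random.default_rng(0)
t = rng.uniform(0, 2 * np.pi, 300)
clean = np.c_[np.cos(t), np.sin(t)]                     # points on a circle
noisy = clean + 0.2 * rng.standard_normal(clean.shape)  # additive Gaussian noise

kpca = KernelPCA(n_components=2, kernel="rbf", gamma=2.0,
                 fit_inverse_transform=True, alpha=0.1)
Z = kpca.fit_transform(noisy)                           # project onto leading nonlinear components
denoised = kpca.inverse_transform(Z)                    # approximate pre-images back in input space
print(np.mean((denoised - clean) ** 2), np.mean((noisy - clean) ** 2))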
Analysis of Drifting Dynamics with Neural Network Hidden Markov Models
Kohlmorgen, Jens, Müller, Klaus-Robert, Pawelzik, Klaus
We present a method for the analysis of nonstationary time series with multiple operating modes. In particular, it is possible to detect and to model both a switching of the dynamics and a less abrupt, time-consuming drift from one mode to another. This is achieved in two steps. First, an unsupervised training method provides prediction experts for the inherent dynamical modes. Then, the trained experts are used in a hidden Markov model that allows drifts to be modeled. An application to physiological wake/sleep data demonstrates that analysis and modeling of real-world time series can be improved when the drift paradigm is taken into account.
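A toy sketch of the drift idea described in this abstract: two fixed prediction experts stand in for the trained neural network experts, hidden states correspond to mixing ratios between them (0 = mode A, 1 = mode B, intermediate values = drift), and Viterbi decoding recovers the state path from the prediction errors. All models and parameters are hypothetical toy choices, not the authors' method.

import numpy as np

rng = np.random.default_rng(0)
T = 200
lam_true = np.clip(np.linspace(-0.5, 1.5, T), 0, 1)   # slow drift from mode A to mode B
expert_A = lambda x: 0.9 * x                          # stand-in prediction experts
expert_B = lambda x: -0.9 * x
x = np.zeros(T)
for t in range(1, T):                                 # generate a series drifting between the two modes
    mix = (1 - lam_true[t]) * expert_A(x[t - 1]) + lam_true[t] * expert_B(x[t - 1])
    x[t] = mix + 0.05 * rng.standard_normal()

lams = np.linspace(0, 1, 5)                           # HMM states: a grid of mixing ratios
sigma = 0.1
pred = np.array([[(1 - l) * expert_A(x[t - 1]) + l * expert_B(x[t - 1]) for l in lams]
                 for t in range(1, T)])
loglik = -0.5 * ((x[1:, None] - pred) / sigma) ** 2   # emission log-likelihoods from prediction errors

K = len(lams)                                         # transitions: stay or move to a neighboring state
logA = np.full((K, K), -np.inf)
for i in range(K):
    for j in range(K):
        if abs(i - j) <= 1:
            logA[i, j] = np.log(0.98) if i == j else np.log(0.01)

delta = loglik[0].copy()                              # Viterbi decoding of the drift-state path
psi = np.zeros((T - 1, K), dtype=int)
for t in range(1, T - 1):
    scores = delta[:, None] + logA
    psi[t] = scores.argmax(axis=0)
    delta = scores.max(axis=0) + loglik[t]
path = [int(delta.argmax())]
for t in range(T - 2, 0, -1):
    path.append(int(psi[t][path[-1]]))
path.reverse()
print(lams[np.array(path)][:10], lams[np.array(path)][-10:])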