Going Metric: Denoising Pairwise Data
Roth, Volker, Laub, Julian, Müller, Klaus-Robert, Buhmann, Joachim M.
Pairwise data in empirical sciences typically violate metricity, either due to noise or due to fallible estimates, and are therefore hard to analyze by conventional machine learning technology. In this paper we study ways to work around this problem. First, we present an alternative embedding to multidimensional scaling (MDS) that allows us to apply a variety of classical machine learning and signal processing algorithms. The class of pairwise grouping algorithms that share the shift-invariance property is statistically invariant under this embedding procedure, leading to identical assignments of objects to clusters. Based on this new vectorial representation, denoising methods are applied in a second step. Both steps provide a theoretically well-controlled setup to translate from pairwise data to the respective denoised metric representation. We demonstrate the practical usefulness of our theoretical reasoning by discovering structure in protein sequence databases, visibly improving on existing automatic methods.
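To make the two-step procedure concrete, here is a minimal NumPy sketch of a constant-shift-style embedding of a symmetric dissimilarity matrix, followed by spectral truncation as the denoising step. The function name, the optional cut-off parameter d, and the clipping of tiny negative eigenvalues are illustrative assumptions, not the authors' code.

```python
import numpy as np

def constant_shift_embedding(D, d=None):
    """Embed a symmetric, possibly non-metric dissimilarity matrix D
    into a Euclidean vector space. A sketch of the shift-invariant
    embedding idea from the abstract, not the paper's implementation.

    D: (n, n) symmetric dissimilarity matrix.
    d: if given, keep only the d leading eigendirections (denoising).
    """
    n = D.shape[0]
    Q = np.eye(n) - np.ones((n, n)) / n          # centering projection
    Sc = -0.5 * Q @ D @ Q                        # centered similarity matrix
    eigvals, eigvecs = np.linalg.eigh(Sc)        # ascending eigenvalues
    # Shifting by the most negative eigenvalue makes Sc - lambda_min*I
    # positive semidefinite; clip tiny negative values from round-off.
    shifted = np.maximum(eigvals - eigvals.min(), 0.0)
    X = eigvecs * np.sqrt(shifted)               # rows: vectorial representations
    if d is not None:                            # optional denoising step
        order = np.argsort(shifted)[::-1][:d]
        X = X[:, order]
    return X
```

Because the shift only adds a constant to all off-diagonal dissimilarities, any shift-invariant pairwise grouping cost yields the same cluster assignments on the rows of X as on D, which is what licenses running standard vectorial algorithms (e.g. k-means) after the embedding.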
Stability-Based Model Selection
Lange, Tilman, Braun, Mikio L., Roth, Volker, Buhmann, Joachim M.
Model selection is linked to model assessment, which is the problem of comparing different models, or model parameters, for a specific learning task. For supervised learning, the standard practical technique is cross-validation, which is not applicable in semi-supervised and unsupervised settings. In this paper, a new model assessment scheme is introduced which is based on a notion of stability. The stability measure yields an upper bound on cross-validation in the supervised case, but extends to semi-supervised and unsupervised problems. In the experimental part, the performance of the stability measure is studied for model order selection in comparison to standard techniques in this area.
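The following is a toy illustration of the stability idea for choosing the number of clusters k, under assumptions that go beyond the abstract: k-means as the clusterer, a 1-nearest-neighbor classifier to transfer labels between split halves, and disagreement minimized over label permutations. The paper's exact protocol and normalization may differ.

```python
import numpy as np
from itertools import permutations
from sklearn.cluster import KMeans
from sklearn.neighbors import KNeighborsClassifier

def instability(X, k, n_splits=20, rng=None):
    """Estimate clustering instability for model order k.
    A sketch in the spirit of the stability measure, not the
    authors' exact scheme. Lower values mean more stable solutions.

    X: (n, d) data array; k: number of clusters to assess.
    """
    rng = np.random.default_rng(rng)
    n = len(X)
    scores = []
    for _ in range(n_splits):
        idx = rng.permutation(n)
        a, b = idx[: n // 2], idx[n // 2:]
        labels_a = KMeans(n_clusters=k, n_init=10).fit_predict(X[a])
        labels_b = KMeans(n_clusters=k, n_init=10).fit_predict(X[b])
        # Transfer the first solution to the second half via a classifier.
        transferred = KNeighborsClassifier(1).fit(X[a], labels_a).predict(X[b])
        # Disagreement under the best matching of cluster labels
        # (brute-force over permutations; fine for small k).
        best = min(
            np.mean(transferred != np.array([p[l] for l in labels_b]))
            for p in permutations(range(k))
        )
        scores.append(best)
    return float(np.mean(scores))
```

Scanning k over a candidate range and picking the value with the lowest instability is the model order selection use case the abstract refers to; unlike cross-validation, nothing here requires ground-truth labels.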
Nonlinear Discriminant Analysis Using Kernel Functions
Roth, Volker, Steinhage, Volker
Fisher's linear discriminant analysis (LDA) is a classical multivariate technique both for dimension reduction and classification. The data vectors are transformed into a low-dimensional subspace such that the class centroids are spread out as much as possible. In this subspace, LDA works as a simple prototype classifier with linear decision boundaries. However, in many applications the linear boundaries do not adequately separate the classes. We present a nonlinear generalization of discriminant analysis that uses the kernel trick of representing dot products by kernel functions.
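Here is a minimal sketch of a two-class kernelized Fisher discriminant in the spirit of the abstract, using the standard dual formulation in which the discriminant direction is expanded over the training points. The ridge regularizer reg is an added assumption for numerical stability and is not taken from the paper.

```python
import numpy as np

def kernel_fda_two_class(K, y, reg=1e-3):
    """Two-class kernel Fisher discriminant (dual formulation sketch).

    K: (n, n) kernel matrix over the training data.
    y: length-n label vector with values in {0, 1}.
    Returns expansion coefficients alpha; projections are K @ alpha.
    """
    y = np.asarray(y)
    n = len(y)
    class_means = []
    N = np.zeros((n, n))                     # within-class scatter (dual)
    for c in (0, 1):
        idx = np.where(y == c)[0]
        Kc = K[:, idx]                       # kernel columns of class c
        class_means.append(Kc.mean(axis=1))  # dual class mean
        H = np.eye(len(idx)) - np.ones((len(idx), len(idx))) / len(idx)
        N += Kc @ H @ Kc.T                   # centered scatter contribution
    M1, M2 = class_means
    # Regularized solve; maximizes between-class over within-class scatter.
    alpha = np.linalg.solve(N + reg * np.eye(n), M2 - M1)
    return alpha
```

With a linear kernel this reduces to ordinary LDA; swapping in, say, an RBF kernel yields the nonlinear decision boundaries the abstract describes, since all computations access the data only through dot products.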