Performance Analysis
Robust Novelty Detection with Single-Class MPM
Ghaoui, Laurent E., Jordan, Michael I., Lanckriet, Gert R.
This algorithm-the "single-class minimax probability machine (MPM)"- is built on a distribution-free methodology that minimizes the worst-case probability of a data point falling outside of a convex set, given only the mean and covariance matrix of the distribution and making no further distributional assumptions. We present a robust approach to estimating the mean and covariance matrix within the general two-class MPM setting, and show how this approach specializes to the single-class problem. We provide empirical results comparing the single-class MPM to the single-class SVM and a two-class SVM method. 1 Introduction Novelty detection is an important unsupervised learning problem in which test data are to be judged as having been generated from the same or a different process as that which generated the training data.
Rational Kernels
Cortes, Corinna, Haffner, Patrick, Mohri, Mehryar
We introduce a general family of kernels based on weighted transducers or rational relations, rational kernels, that can be used for analysis of variable-length sequences or more generally weighted automata, in applications such as computational biology or speech recognition. We show that rational kernels can be computed efficiently using a general algorithm of composition of weighted transducers and a general single-source shortest-distance algorithm. We also describe several general families of positive definite symmetric rational kernels. These general kernels can be combined with Support Vector Machines to form efficient and powerful techniques for spoken-dialog classification: highly complex kernels become easy to design and implement and lead to substantial improvements in the classification accuracy. We also show that the string kernels considered in applications to computational biology are all specific instances of rational kernels.
Margin-Based Algorithms for Information Filtering
Cesa-bianchi, Nicolรฒ, Conconi, Alex, Gentile, Claudio
In this work, we study an information filtering model where the relevance labels associated to a sequence of feature vectors are realizations of an unknown probabilistic linear function. Building on the analysis of a restricted version of our model, we derive a general filtering rule based on the margin of a ridge regression estimator. While our rule may observe the label of a vector only by classfying the vector as relevant, experiments on a real-world document filtering problem show that the performance of our rule is close to that of the online classifier which is allowed to observe all labels. These empirical results are complemented by a theoretical analysis where we consider a randomized variant of our rule and prove that its expected number of mistakes is never much larger than that of the optimal filtering rule which knows the hidden linear model.
Margin-Based Algorithms for Information Filtering
Cesa-bianchi, Nicolรฒ, Conconi, Alex, Gentile, Claudio
In this work, we study an information filtering model where the relevance labels associated to a sequence of feature vectors are realizations of an unknown probabilistic linear function. Building on the analysis of a restricted version of our model, we derive a general filtering rule based on the margin of a ridge regression estimator. While our rule may observe the label of a vector only by classfying the vector as relevant, experiments on a real-world document filtering problem show that the performance of our rule is close to that of the online classifier which is allowed to observe all labels. These empirical results are complemented by a theoretical analysis where we consider a randomized variant of our rule and prove that its expected number of mistakes is never much larger than that of the optimal filtering rule which knows the hidden linear model.
Using Manifold Stucture for Partially Labeled Classification
Belkin, Mikhail, Niyogi, Partha
We consider the general problem of utilizing both labeled and unlabeled datato improve classification accuracy. Under t he assumption that the data lie on a submanifold in a high dimensional space, we develop an algorithmic framework to classify a partially labeled data set in a principled manner. The central idea of our approach is that classification functions are naturally defined only on t he submanifold inquestion rather than the total ambient space. Using the Laplace Beltrami operator one produces a basis for a Hilbert space of square integrable functions on the submanifold. To recover such a basis, only unlabeled examples are required. Once a basis is obtained, trainingcan be performed using the labeled data set.
Learning to Classify Galaxy Shapes Using the EM Algorithm
Kirshner, Sergey, Cadez, Igor V., Smyth, Padhraic, Kamath, Chandrika
We describe the application of probabilistic model-based learning to the problem of automatically identifying classes of galaxies, based on both morphological and pixel intensity characteristics. The EM algorithm can be used to learn how to spatially orient a set of galaxies so that they are geometrically aligned. We augment this "ordering-model" with a mixture model on objects, and demonstrate how classes of galaxies can be learned in an unsupervised manner using a two-level EM algorithm. The resulting models provide highly accurate classiยฃcation of galaxies in cross-validation experiments.
Improving a Page Classifier with Anchor Extraction and Link Analysis
Most text categorization systems use simple models of documents and document collections. In this paper we describe a technique that improves asimple web page classifier's performance on pages from a new, unseen web site, by exploiting link structure within a site as well as page structure within hub pages. On real-world test cases, this technique significantly and substantially improves the accuracy of a bag-of-words classifier, reducing error rate by about half, on average. The system uses a variant of co-training to exploit unlabeled data from a new site. Pages are labeled using the base classifier; the results are used by a restricted wrapper-learner to propose potential "main-category anchor wrappers"; and finally, these wrappers are used as features by a third learner to find a categorization of the site that implies a simple hub structure, but which also largely agrees with the original bag-of-words classifier.
The RA Scanner: Prediction of Rheumatoid Joint Inflammation Based on Laser Imaging
Schwaighofer, Anton, Tresp, Volker, Mayer, Peter, Scheel, Alexander K., Mรผller, Gerhard A.
We describe the RA scanner, a novel system for the examination of patients sufferingfrom rheumatoid arthritis. The RA scanner is based on a novel laser-based imaging technique which is sensitive to the optical characteristics of finger joint tissue. Based on the laser images, finger joints are classified according to whether the inflammatory status has improved or worsened. To perform the classification task, various linear andkernel-based systems were implemented and their performances were compared. Special emphasis was put on measures to reliably perform parametertuning and evaluation, since only a very small data set was available. Based on the results presented in this paper, it was concluded thatthe RA scanner permits a reliable classification of pathological finger joints, thus paving the way for a further development from prototype to product stage.
Learning to Detect Natural Image Boundaries Using Brightness and Texture
Martin, David R., Fowlkes, Charless C., Malik, Jitendra
The goal of this work is to accurately detect and localize boundaries in natural scenes using local image measurements. We formulate features that respond to characteristic changes in brightness and texture associated with natural boundaries. In order to combine the information from these features in an optimal way, a classifier is trained using human labeled images as ground truth. We present precision-recall curves showing that the resulting detector outperforms existing approaches.
FloatBoost Learning for Classification
Li, Stan Z., Zhang, Zhenqiu, Shum, Heung-yeung, Zhang, Hongjiang
AdaBoost [3] minimizes an upper error bound which is an exponential function of the margin on the training set [14]. However, the ultimate goal in applications of pattern classification is always minimum error rate. On the other hand, AdaBoost needs an effective procedure for learning weak classifiers, which by itself is difficult especially for high dimensional data. In this paper, we present a novel procedure, called FloatBoost, for learning a better boosted classifier. FloatBoost uses a backtrack mechanism after each iteration of AdaBoost to remove weak classifiers which cause higher error rates. The resulting float-boosted classifier consists of fewer weak classifiers yet achieves lower error rates than AdaBoost in both training and test. We also propose a statistical model for learning weak classifiers, based on a stagewise approximation of the posterior using an overcomplete set of scalar features. Experimental comparisonsof FloatBoost and AdaBoost are provided through a difficult classification problem, face detection, where the goal is to learn from training examples a highly nonlinear classifier to differentiate between faceand nonface patterns in a high dimensional space. The results clearly demonstrate the promises made by FloatBoost over AdaBoost.