Statistical Learning
A Probabilistic Interpretation of SVMs with an Application to Unbalanced Classification
Grandvalet, Yves, Mariethoz, Johnny, Bengio, Samy
In this paper, we show that the hinge loss can be interpreted as the neg-log-likelihood of a semi-parametric model of posterior probabilities. From this point of view, SVMs represent the parametric component of a semi-parametric model fitted by a maximum a posteriori estimation procedure. Thisconnection enables to derive a mapping from SVM scores to estimated posterior probabilities. Unlike previous proposals, the suggested mappingis interval-valued, providing a set of posterior probabilities compatible with each SVM score. This framework offers a new way to adapt the SVM optimization problem to unbalanced classification, whendecisions result in unequal (asymmetric) losses. Experiments show improvements over state-of-the-art procedures.
Query by Committee Made Real
Gilad-bachrach, Ran, Navot, Amir, Tishby, Naftali
Training a learning algorithm is a costly task. A major goal of active learning is to reduce this cost. In this paper we introduce a new algorithm, KQBC,which is capable of actively learning large scale problems by using selective sampling. The algorithm overcomes the costly sampling stepof the well known Query By Committee (QBC) algorithm by projecting onto a low dimensional space. KQBC also enables the use of kernels, providing a simple way of extending QBC to the nonlinear scenario. Sampling the low dimension space is done using the hit and run random walk. We demonstrate the success of this novel algorithm by applying it to both artificial and a real world problems.
Hierarchical Linear/Constant Time SLAM Using Particle Filters for Dense Maps
Eliazar, Austin I., Parr, Ronald
We present an improvement to the DP-SLAM algorithm for simultaneous localizationand mapping (SLAM) that maintains multiple hypotheses about densely populated maps (one full map per particle in a particle filter)in time that is linear in all significant algorithm parameters and takes constant (amortized) time per iteration. This means that the asymptotic complexity of the algorithm is no greater than that of a pure localization algorithm using a single map and the same number of particles. Wealso present a hierarchical extension of DP-SLAM that uses a two level particle filter which models drift in the particle filtering process itself. The hierarchical approach enables recovery from the inevitable drift that results from using a finite number of particles in a particle filter and permits the use of DP-SLAM in more challenging domains, while maintaining linear time asymptotic complexity.
Data-Driven Online to Batch Conversions
Online learning algorithms are typically fast, memory efficient, and simple toimplement. However, many common learning problems fit more naturally in the batch learning setting. The power of online learning algorithms can be exploited in batch settings by using online-to-batch conversions techniques which build a new batch algorithm from an existing onlinealgorithm. We first give a unified overview of three existing online-to-batch conversion techniques which do not use training data in the conversion process. We then build upon these data-independent conversions to derive and analyze data-driven conversions.
The Forgetron: A Kernel-Based Perceptron on a Fixed Budget
Dekel, Ofer, Shalev-shwartz, Shai, Singer, Yoram
The Perceptron algorithm, despite its simplicity, often performs well on online classification tasks. The Perceptron becomes especially effective when it is used in conjunction with kernels. However, a common difficulty encounteredwhen implementing kernel-based online algorithms is the amount of memory required to store the online hypothesis, which may grow unboundedly. In this paper we present and analyze the Forgetron algorithmfor kernel-based online learning on a fixed memory budget. To our knowledge, this is the first online learning algorithm which, on one hand, maintains a strict limit on the number of examples it stores while, on the other hand, entertains a relative mistake bound. In addition to the formal results, we also present experiments with real datasets which underscore the merits of our approach.
Fast Krylov Methods for N-Body Learning
Freitas, Nando D., Wang, Yang, Mahdaviani, Maryam, Lang, Dustin
This paper addresses the issue of numerical computation in machine learning domains based on similarity metrics, such as kernel methods, spectral techniques and Gaussian processes. It presents a general solution strategybased on Krylov subspace iteration and fast N-body learning methods. The experiments show significant gains in computation and storage on datasets arising in image segmentation, object detection and dimensionality reduction. The paper also presents theoretical bounds on the stability of these methods.