Knowledge-Based Support Vector Machine Classifiers
Fung, Glenn M., Mangasarian, Olvi L., Shavlik, Jude W.
Prior knowledge in the form of multiple polyhedral sets, each belonging to one of two categories, is introduced into a reformulation of a linear support vector machine classifier. The resulting formulation leads to a linear program that can be solved efficiently. Real world examples, from DNA sequencing and breast cancer prognosis, demonstrate the effectiveness of the proposed method. Numerical results show improvement in test set accuracy after the incorporation of prior knowledge into ordinary, data-based linear support vector machine classifiers. One experiment also shows that a linear classifier, based solely on prior knowledge, far outperforms the direct application of prior knowledge rules to classify data.
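To illustrate the kind of linear program involved, here is a sketch of a plain 1-norm soft-margin linear SVM posed as an LP (the knowledge-based formulation of the paper adds further constraints encoding the polyhedral sets, which this sketch omits). The function name and variable split `w = p - q` are illustrative, using `scipy.optimize.linprog`:

```python
import numpy as np
from scipy.optimize import linprog

def lp_svm(X, y, C=1.0):
    """1-norm soft-margin linear SVM solved as a linear program.

    Variables: w = p - q (p, q >= 0), b = bp - bq, slacks xi >= 0.
    Objective: ||w||_1 + C * sum(xi), subject to y_i (w.x_i + b) >= 1 - xi_i.
    """
    n, d = X.shape
    # cost vector over [p, q, bp, bq, xi]
    c = np.concatenate([np.ones(2 * d), np.zeros(2), C * np.ones(n)])
    Yx = y[:, None] * X
    # margin constraints rewritten for linprog's A_ub x <= b_ub convention
    A_ub = np.hstack([-Yx, Yx, -y[:, None], y[:, None], -np.eye(n)])
    b_ub = -np.ones(n)
    res = linprog(c, A_ub=A_ub, b_ub=b_ub,
                  bounds=[(0, None)] * (2 * d + 2 + n))
    z = res.x
    w = z[:d] - z[d:2 * d]
    b = z[2 * d] - z[2 * d + 1]
    return w, b
```

On linearly separable data with a moderate penalty C, the LP drives the slacks to zero and recovers a separating hyperplane.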
VIBES: A Variational Inference Engine for Bayesian Networks
Bishop, Christopher M., Spiegelhalter, David, Winn, John
In recent years variational methods have become a popular tool for approximate inference and learning in a wide variety of probabilistic models. For each new application, however, it is currently necessary first to derive the variational update equations, and then to implement them in application-specific code. Each of these steps is both time consuming and error prone. In this paper we describe a general purpose inference engine called VIBES ('Variational Inference for Bayesian Networks') which allows a wide variety of probabilistic models to be implemented and solved variationally without recourse to coding. New models are specified either through a simple script or via a graphical interface analogous to a drawing package. VIBES then automatically generates and solves the variational equations. We illustrate the power and flexibility of VIBES using examples from Bayesian mixture modelling.
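As a minimal illustration of the kind of hand-derived update equations that VIBES automates, here is coordinate-ascent variational inference for a one-dimensional Gaussian with unknown mean mu and precision tau, under a factorized posterior q(mu)q(tau). The function name and prior settings are illustrative; these are the standard normal-gamma mean-field updates, not VIBES itself:

```python
import numpy as np

def vb_gaussian(x, mu0=0.0, lam0=1.0, a0=1.0, b0=1.0, iters=50):
    """Mean-field VB for a 1-D Gaussian: q(mu) is Normal, q(tau) is Gamma.

    Iterates the coupled updates until (approximate) convergence and
    returns the variational parameters (mu_n, lam_n, a_n, b_n).
    """
    n, xbar = len(x), np.mean(x)
    e_tau = a0 / b0                   # initial guess for E[tau]
    a_n = a0 + (n + 1) / 2.0          # this shape parameter is fixed
    for _ in range(iters):
        # update q(mu) = N(mu_n, 1 / lam_n) given E[tau]
        mu_n = (lam0 * mu0 + n * xbar) / (lam0 + n)
        lam_n = (lam0 + n) * e_tau
        e_mu, e_mu2 = mu_n, mu_n ** 2 + 1.0 / lam_n
        # update q(tau) = Gamma(a_n, b_n) given E[mu], E[mu^2]
        b_n = b0 + 0.5 * (np.sum(x ** 2) - 2 * e_mu * np.sum(x) + n * e_mu2
                          + lam0 * (e_mu2 - 2 * mu0 * e_mu + mu0 ** 2))
        e_tau = a_n / b_n
    return mu_n, lam_n, a_n, b_n
```

Even this simplest model needs two coupled, manually derived updates; VIBES's point is to generate such equations automatically from a graphical model specification.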
Feature Selection by Maximum Marginal Diversity
We address the question of feature selection in the context of visual recognition. It is shown that, besides being efficient from a computational standpoint, the infomax principle is nearly optimal in the minimum Bayes error sense. The concept of marginal diversity is introduced, leading to a generic principle for feature selection (the principle of maximum marginal diversity) of extreme computational simplicity. The relationships between infomax and the maximization of marginal diversity are identified, uncovering the existence of a family of classification procedures for which near optimal (in the Bayes error sense) feature selection does not require combinatorial search. Examination of this family in light of recent studies on the statistics of natural images suggests that visual recognition problems are a subset of it.
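The marginal diversity of a feature is the class-averaged KL divergence between its class-conditional marginal density and its overall marginal density, so the selection rule reduces to scoring each feature independently. A histogram-based sketch (function name and binning choices are illustrative, not the paper's implementation):

```python
import numpy as np

def marginal_diversity(X, y, bins=16):
    """Score each feature by its marginal diversity:
    sum_c P(c) * KL( P(x_k | y=c) || P(x_k) ), estimated by histograms.
    Higher scores indicate more discriminative features."""
    n, d = X.shape
    classes, scores = np.unique(y), np.zeros(d)
    for k in range(d):
        edges = np.histogram_bin_edges(X[:, k], bins=bins)
        p = np.histogram(X[:, k], bins=edges)[0] / n + 1e-12
        for c in classes:
            xc = X[y == c, k]
            q = np.histogram(xc, bins=edges)[0] / len(xc) + 1e-12
            scores[k] += (len(xc) / n) * np.sum(q * np.log(q / p))
    return scores  # select the features with the largest scores
```

No combinatorial search over feature subsets is needed: each feature is scored once and the top-scoring ones are kept.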
Going Metric: Denoising Pairwise Data
Roth, Volker, Laub, Julian, Müller, Klaus-Robert, Buhmann, Joachim M.
Pairwise data in empirical sciences typically violate metricity, either due to noise or due to fallible estimates, and are therefore hard to analyze by conventional machine learning technology. In this paper we therefore study ways to work around this problem. First, we present an alternative embedding to multidimensional scaling (MDS) that allows us to apply a variety of classical machine learning and signal processing algorithms. The class of pairwise grouping algorithms which share the shift-invariance property is statistically invariant under this embedding procedure, leading to identical assignments of objects to clusters. Based on this new vectorial representation, denoising methods are applied in a second step. Both steps provide a theoretically well controlled setup to translate from pairwise data to the respective denoised metric representation. We demonstrate the practical usefulness of our theoretical reasoning by discovering structure in protein sequence databases, visibly improving performance over existing automatic methods.
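A sketch of the embedding idea: double-center the dissimilarity matrix as in classical MDS, then add a constant shift to make the centered matrix positive semidefinite, which changes all off-diagonal squared distances by the same amount and therefore leaves shift-invariant clustering costs unchanged. The function name and details are illustrative and may differ from the paper's exact constant-shift construction:

```python
import numpy as np

def shift_embed(D):
    """Embed a (possibly non-metric) symmetric dissimilarity matrix D
    into a vector space.

    Double-center D (classical MDS), then shift the spectrum up so the
    Gram matrix is PSD; rows of the result are the embedded points."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ D @ J                    # centered (pseudo-)Gram matrix
    w, V = np.linalg.eigh(B)
    shift = max(0.0, -w.min())              # constant spectral shift -> PSD
    w = w + shift
    w[np.abs(w) < 1e-10] = 0.0              # clean up numerical noise
    return V * np.sqrt(np.maximum(w, 0.0))  # rows are embedded points
```

When D already consists of squared Euclidean distances, the shift is zero and the embedding reproduces the original geometry exactly.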
Constraint Classification for Multiclass Classification and Ranking
Har-Peled, Sariel, Roth, Dan, Zimak, Dav
We present a meta-algorithm for learning in the constraint classification framework that learns via a single linear classifier in high dimension. We discuss distribution independent as well as margin-based generalization bounds, and present empirical and theoretical evidence showing that constraint classification benefits over existing methods of multiclass classification.
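The reduction to a single linear classifier can be sketched with a Kesler-style expansion: each multiclass example (x, y) in k classes generates, for every competing class y', a constraint vector in k*d dimensions (x in block y, -x in block y') that a single linear separator must classify as positive. A perceptron is used here as the binary learner purely for illustration; names are hypothetical:

```python
import numpy as np

def expand(x, yi, yj, k):
    """Constraint 'class yi ranks above yj' for example x, embedded in k*d dims."""
    d = len(x)
    v = np.zeros(k * d)
    v[yi * d:(yi + 1) * d] = x
    v[yj * d:(yj + 1) * d] = -x
    return v

def train_constraint_perceptron(X, y, k, epochs=50):
    """One linear classifier in k*d dimensions satisfying all pairwise constraints."""
    n, d = X.shape
    w = np.zeros(k * d)
    for _ in range(epochs):
        for x, yi in zip(X, y):
            for yj in range(k):
                if yj != yi:
                    v = expand(x, yi, yj, k)
                    if w @ v <= 0:      # violated constraint: perceptron update
                        w += v
    return w

def predict(w, x, k):
    """Predicted class = block of w scoring highest on x."""
    d = len(x)
    return int(np.argmax([w[c * d:(c + 1) * d] @ x for c in range(k)]))
```

Prediction takes the argmax over the k blocks of w, so ranking (not just classification) falls out of the same learned vector.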
Adaptation and Unsupervised Learning
Dayan, Peter, Sahani, Maneesh, Deback, Gregoire
Adaptation is a ubiquitous neural and psychological phenomenon, with a wealth of instantiations and implications. Although a basic form of plasticity, it has, bar some notable exceptions, attracted computational theory of only one main variety. In this paper, we study adaptation from the perspective of factor analysis, a paradigmatic technique of unsupervised learning. We use factor analysis to reinterpret a standard view of adaptation, and apply our new model to some recent data on adaptation in the domain of face discrimination.
Optimality of Reinforcement Learning Algorithms with Linear Function Approximation
There are several reinforcement learning algorithms that yield approximate solutions for the problem of policy evaluation when the value function is represented with a linear function approximator. In this paper we show that each of the solutions is optimal with respect to a specific objective function.
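The best-known member of this family is TD(0) with a linear approximator V(s) = phi(s)·w, which updates the weights toward the one-step bootstrapped target. A minimal sketch (function name and the episode-step format are illustrative):

```python
import numpy as np

def td0_linear(steps, phi, d, alpha=0.05, gamma=0.9):
    """TD(0) policy evaluation with linear function approximation.

    `steps` is an iterable of (s, r, s_next, done) transitions;
    `phi(s)` maps a state to a d-dimensional feature vector."""
    w = np.zeros(d)
    for s, r, s_next, done in steps:
        v_next = 0.0 if done else phi(s_next) @ w
        delta = r + gamma * v_next - phi(s) @ w   # TD error
        w += alpha * delta * phi(s)               # semi-gradient step
    return w
```

With one-hot (tabular) features this reduces to ordinary tabular TD(0); with general features, the fixed point it converges to is exactly the kind of solution whose implicit objective function the paper characterizes.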
Conditional Models on the Ranking Poset
Lebanon, Guy, Lafferty, John D.
A distance-based conditional model on the ranking poset is presented for use in classification and ranking. The model is an extension of the Mallows model, and generalizes the classifier combination methods used by several ensemble learning algorithms, including error correcting output codes, discrete AdaBoost, logistic regression and cranking. The algebraic structure of the ranking poset leads to a simple Bayesian interpretation of the conditional model and its special cases. In addition to a unifying view, the framework suggests a probabilistic interpretation for error correcting output codes and an extension beyond the binary coding scheme.
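The underlying Mallows model assigns each permutation a probability that decays exponentially in its distance from a central ranking, typically the Kendall (discordant-pair) distance. A brute-force sketch for small n (function names are illustrative; the paper's extension lives on the ranking poset rather than on full permutations only):

```python
import itertools
import math

def kendall_tau(pi, sigma):
    """Kendall distance: number of item pairs ordered differently by pi and sigma."""
    n = len(pi)
    return sum((pi.index(a) < pi.index(b)) != (sigma.index(a) < sigma.index(b))
               for a, b in itertools.combinations(range(n), 2))

def mallows_pmf(theta, center, n):
    """Mallows model over all permutations of {0..n-1}:
    P(pi) proportional to exp(-theta * d(pi, center))."""
    perms = list(itertools.permutations(range(n)))
    weights = [math.exp(-theta * kendall_tau(list(p), list(center)))
               for p in perms]
    Z = sum(weights)
    return {p: w / Z for p, w in zip(perms, weights)}
```

Larger theta concentrates the distribution around the central ranking; theta = 0 recovers the uniform distribution over permutations.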
Neuromorphic Bistable VLSI Synapses with Spike-Timing-Dependent Plasticity
In bistable spike-timing-dependent plasticity (STDP) synapses, the short-term dynamics of the synaptic efficacies are governed by the relative timing of the pre- and post-synaptic spikes, while on long time scales the efficacies tend asymptotically to either a potentiated state or to a depressed one. We fabricated a prototype VLSI chip containing a network of integrate-and-fire neurons interconnected via bistable STDP synapses. Test results from this chip demonstrate the synapse's STDP learning properties and its long-term bistable characteristics.
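A software caricature of this behavior (not the chip's circuit dynamics; all constants and the function name are illustrative): pair-based STDP updates an internal efficacy variable, and a slow drift pushes it toward the depressed state (0) or the potentiated state (1) depending on which side of a threshold it sits, producing the bistable long-term behavior:

```python
import numpy as np

def stdp_bistable(pre_spikes, post_spikes, a_plus=0.1, a_minus=0.12,
                  tau=20.0, theta=0.5, drift=0.001, T=500):
    """Pair-based STDP with a bistable long-term efficacy.

    Spike-timing updates (exponential in the pre/post interval) move an
    internal variable w; a constant drift then pulls w toward 0 or 1
    depending on whether it is below or above the threshold theta."""
    w = 0.5
    for t in range(T):
        for tp in pre_spikes:
            for tq in post_spikes:
                if tp == t and tq < t:   # post fired before pre: depress
                    w -= a_minus * np.exp(-(t - tq) / tau)
                if tq == t and tp < t:   # pre fired before post: potentiate
                    w += a_plus * np.exp(-(t - tp) / tau)
        w += drift if w > theta else -drift   # bistable long-term drift
        w = min(1.0, max(0.0, w))
    return w  # asymptotically 0 (depressed) or 1 (potentiated)
```

Repeated pre-before-post pairings drive the efficacy to the potentiated state, while the reversed ordering drives it to the depressed one.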