Margin Maximizing Loss Functions

Neural Information Processing Systems

Margin maximizing properties play an important role in the analysis of classification models, such as boosting and support vector machines. Margin maximization is theoretically interesting because it facilitates generalization error analysis, and practically interesting because it presents a clear geometric interpretation of the models being built. We formulate and prove a sufficient condition for the solutions of regularized loss functions to converge to margin maximizing separators, as the regularization vanishes. This condition covers the hinge loss of SVM, the exponential loss of AdaBoost, and the logistic regression loss. We also generalize it to multi-class classification problems, and present margin maximizing multiclass versions of logistic regression and support vector machines.
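
The setting can be pictured numerically. Below is a minimal sketch, not the paper's procedure: it minimizes an L2-regularized exponential or logistic margin loss and returns the normalized separator direction. On linearly separable data, the abstract's result says this direction should approach the L2 margin-maximizing (SVM) separator as the regularization weight shrinks. The function name, loss labels, and optimizer choice are illustrative assumptions.

    import numpy as np
    from scipy.optimize import minimize

    def regularized_direction(X, y, lam, loss="exponential"):
        # Minimize sum_i loss(y_i * <w, x_i>) + lam * ||w||^2 and return w / ||w||.
        # "exponential" corresponds to AdaBoost's loss, "logistic" to logistic
        # regression; both are covered by the sufficient condition discussed above.
        def objective(w):
            margins = y * (X @ w)
            if loss == "exponential":
                per_example = np.exp(-margins)
            else:  # logistic loss
                per_example = np.log1p(np.exp(-margins))
            return per_example.sum() + lam * np.dot(w, w)

        w = minimize(objective, np.zeros(X.shape[1]), method="L-BFGS-B").x
        return w / np.linalg.norm(w)

    # As lam -> 0 on separable data, regularized_direction(X, y, lam) is expected
    # to approach the max-margin direction, which is the convergence being studied.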


Modeling User Rating Profiles For Collaborative Filtering

Neural Information Processing Systems

In this paper we present a generative latent variable model for rating-based collaborative filtering called the User Rating Profile model (URP). The generative process which underlies URP is designed to produce complete user rating profiles, an assignment of one rating to each item for each user. Our model represents each user as a mixture of user attitudes, and the mixing proportions are distributed according to a Dirichlet random variable. The rating for each item is generated by selecting a user attitude for the item, and then selecting a rating according to the preference pattern associated with that attitude. URP is related to several models including a multinomial mixture model, the aspect model [7], and LDA [1], but has clear advantages over each.
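
As a rough illustration of the generative process just described, the sketch below samples one complete rating profile: a Dirichlet draw gives the user's attitude mixture, and each item's rating is drawn from the preference pattern of an attitude sampled for that item. Parameter names and array shapes are assumptions for illustration, not the paper's notation.

    import numpy as np

    def sample_rating_profile(alpha, beta, rng=None):
        # alpha: (K,) Dirichlet parameter over K user attitudes (assumed name).
        # beta:  (K, n_items, n_rating_values) per-attitude rating distributions;
        #        beta[z, j] is a distribution over rating values for item j.
        rng = rng or np.random.default_rng()
        theta = rng.dirichlet(alpha)                        # user's attitude mixture
        n_items, n_values = beta.shape[1], beta.shape[2]
        profile = np.empty(n_items, dtype=int)
        for j in range(n_items):
            z = rng.choice(len(alpha), p=theta)             # attitude chosen for item j
            profile[j] = rng.choice(n_values, p=beta[z, j]) # rating from that attitude
        return profile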


Perspectives on Sparse Bayesian Learning

Neural Information Processing Systems

Recently, relevance vector machines (RVM) have been fashioned from a sparse Bayesian learning (SBL) framework to perform supervised learning using a weight prior that encourages sparsity of representation. The methodology incorporates an additional set of hyperparameters governing the prior, one for each weight, and then adopts a specific approximation to the full marginalization over all weights and hyperparameters. Despite its empirical success, however, no rigorous motivation for this particular approximation is currently available. To address this issue, we demonstrate that SBL can be recast as the application of a rigorous variational approximation to the full model by expressing the prior in a dual form. This formulation obviates the necessity of assuming any hyperpriors and leads to natural, intuitive explanations of why sparsity is achieved in practice.
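
The prior structure referred to above, with one hyperparameter per weight, can be pictured with a small sketch: each weight gets its own Gaussian prior variance, and driving a variance toward zero pins the corresponding weight at zero, which is where the sparsity comes from. Names below are illustrative, not the paper's notation.

    import numpy as np

    def sample_sbl_prior(gamma, rng=None):
        # gamma: per-weight prior variances, one hyperparameter per weight.
        # Each w_i ~ N(0, gamma_i); as gamma_i -> 0 the prior concentrates at
        # w_i = 0, so optimising or marginalising the gamma_i prunes weights.
        rng = rng or np.random.default_rng()
        gamma = np.asarray(gamma, dtype=float)
        return rng.normal(loc=0.0, scale=np.sqrt(gamma))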


Probability Estimates for Multi-Class Classification by Pairwise Coupling

Neural Information Processing Systems

Pairwise coupling is a popular multi-class classification method that combines the pairwise comparisons between each pair of classes. This paper presents two approaches for obtaining class probabilities from these pairwise comparisons. Both methods can be reduced to linear systems and are easy to implement. We show conceptually and experimentally that the proposed approaches are more stable than two existing popular methods: voting and the method of [3].
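
To make the linear-system reduction concrete, here is a hedged sketch of one way to couple pairwise probability estimates r[i, j] (approximating P(class i | class i or class j)) into a single probability vector by minimizing a quadratic objective under the constraint that the probabilities sum to one. The exact objective is an assumption in the spirit of the paper's coupling approach, not a verbatim reproduction.

    import numpy as np

    def couple_pairwise_probs(r):
        # r[i, j] ~ P(class i | class i or class j), with r[i, j] + r[j, i] = 1.
        # Minimize sum_{i<j} (r[j, i] * p[i] - r[i, j] * p[j])^2 subject to
        # sum(p) = 1 by solving the KKT linear system (nonnegativity of p is
        # not enforced in this sketch).
        k = r.shape[0]
        Q = np.zeros((k, k))
        for i in range(k):
            Q[i, i] = sum(r[s, i] ** 2 for s in range(k) if s != i)
            for j in range(k):
                if j != i:
                    Q[i, j] = -r[j, i] * r[i, j]
        A = np.block([[Q, np.ones((k, 1))],
                      [np.ones((1, k)), np.zeros((1, 1))]])
        b = np.concatenate([np.zeros(k), [1.0]])
        return np.linalg.solve(A, b)[:k]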


A Mixed-Signal VLSI for Real-Time Generation of Edge-Based Image Vectors

Neural Information Processing Systems

A mixed-signal image-filtering VLSI has been developed for real-time generation of edge-based image vectors for robust image recognition. A four-stage asynchronous median detection architecture based on analog-digital mixed-signal circuits has been introduced to determine the threshold value of edge detection, the key processing parameter in vector generation. As a result, a fully seamless pipeline from threshold detection to edge feature map generation has been established. A prototype chip was designed in a 0.35-µm double-polysilicon three-metal-layer CMOS technology, and the concept was verified by the fabricated chip. The chip generates a 64-dimension feature vector from a 64x64-pixel grayscale image every 80 µs.


Log-Linear Models for Label Ranking

Neural Information Processing Systems

Label ranking is the task of inferring a total order over a predefined set of labels for each given instance. We present a general framework for batch learning of label ranking functions from supervised data. We assume that each instance in the training data is associated with a list of preferences over the label set; however, we do not assume that this list is either complete or consistent. This enables us to accommodate a variety of ranking problems. In contrast to the general form of the supervision, our goal is to learn a ranking function that induces a total order over the entire set of labels. Special cases of our setting are multilabel categorization and hierarchical classification. We present a general boosting-based learning algorithm for the label ranking problem and prove a lower bound on the progress of each boosting iteration. The applicability of our approach is demonstrated with a set of experiments on a large-scale text corpus.
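
As a small illustration of the prediction side of this setting (the boosting-based training procedure itself is not sketched here), a log-linear label ranker can score every label with a per-label linear function and output the induced total order. The weight-matrix parameterization below is an assumption for illustration.

    import numpy as np

    def rank_labels(W, x):
        # W: (n_labels, n_features) per-label weight vectors (assumed form);
        # x: (n_features,) instance. A higher score means a more preferred label.
        scores = W @ x
        return np.argsort(-scores)   # label indices from most to least preferred

    # The supervision may supply only partial or inconsistent preferences, but
    # the predictor above always outputs a total order over all labels.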


Online Passive-Aggressive Algorithms

Neural Information Processing Systems

We present a unified view for online classification, regression, and uniclass problems. This view leads to a single algorithmic framework for the three problems. We prove worst-case loss bounds for various algorithms for both the realizable case and the non-realizable case. A conversion of our main online algorithm to the setting of batch learning is also discussed. The end result is new algorithms and accompanying loss bounds for the hinge loss.
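
For the binary classification case, the passive-aggressive update with the hinge loss has a simple closed form. The sketch below shows only the basic unbounded-step variant, with names chosen for illustration.

    import numpy as np

    def pa_update(w, x, y):
        # One passive-aggressive step for binary classification; y is +1 or -1.
        # Passive: if the margin constraint y * <w, x> >= 1 already holds, keep w.
        # Aggressive: otherwise move w just enough to satisfy the constraint.
        loss = max(0.0, 1.0 - y * np.dot(w, x))   # hinge loss on this example
        if loss > 0.0:
            tau = loss / np.dot(x, x)             # closed-form step size
            w = w + tau * y * x
        return w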



Training fMRI Classifiers to Detect Cognitive States across Multiple Human Subjects

Neural Information Processing Systems

We consider learning to classify cognitive states of human subjects, based on their brain activity observed via functional Magnetic Resonance Imaging (fMRI). This problem is important because such classifiers constitute "virtual sensors" of hidden cognitive states, which may be useful in cognitive science research and clinical applications. In recent work, Mitchell et al. [6,7,9] have demonstrated the feasibility of training such classifiers for individual human subjects (e.g., to distinguish whether the subject is reading an ambiguous or unambiguous sentence, or whether they are reading a noun or a verb). Here we extend that line of research, exploring how to train classifiers that can be applied across multiple human subjects, including subjects who were not involved in training the classifier. We describe the design of several machine learning approaches to training multiple-subject classifiers, and report experimental results demonstrating the success of these methods in learning cross-subject classifiers for two different fMRI data sets.