Goto

Collaborating Authors

 Genre


The Lovasz-Bregman Divergence and connections to rank aggregation, clustering, and web ranking

arXiv.org Machine Learning

We extend the recently introduced theory of Lovasz-Bregman (LB) divergences (Iyer & Bilmes, 2012) in several ways. We show that they represent a distortion between a 'score' and an 'ordering', thus providing a new view of rank aggregation and order based clustering with interesting connections to web ranking. We show how the LB divergences have a number of properties akin to many permutation based metrics, and in fact have as special cases forms very similar to the Kendall-$\tau$ metric. We also show how the LB divergences subsume a number of commonly used ranking measures in information retrieval, like the NDCG and AUC. Unlike the traditional permutation based metrics, however, the LB divergence naturally captures a notion of "confidence" in the orderings, thus providing a new representation to applications involving aggregating scores as opposed to just orderings. We show how a number of recently used web ranking models are forms of Lovasz-Bregman rank aggregation and also observe that a natural form of Mallow's model using the LB divergence has been used as conditional ranking models for the 'Learning to Rank' problem.


Understanding the Predictive Power of Computational Mechanics and Echo State Networks in Social Media

arXiv.org Machine Learning

ABSTRACT There is a large amount of interest in understanding users of social media in order to predict their behavior in this space. Despite this interest, user predictability in social media is not well-understood. To examine this question, we consider a network of fifteen thousand users on Twitter over a seven week period. We apply two contrasting modeling paradigms: computational mechanics and echo state networks. Both methods attempt to model the behavior of users on the basis of their past behavior. We demonstrate that the behavior of users on Twitter can be well-modeled as processes with self-feedback. We find that the two modeling approaches perform very similarly for most users, but that they differ in performance on a small subset of the users. By exploring the properties of these performance-differentiated users, we highlight the challenges faced in applying predictive models to dynamic social data. I INTRODUCTION At the most abstract level, an individual using a social media service may be viewed as a computational agent [1].


Manopt, a Matlab toolbox for optimization on manifolds

arXiv.org Machine Learning

Optimization on manifolds is a rapidly developing branch of nonlinear optimization. Its focus is on problems where the smooth geometry of the search space can be leveraged to design efficient numerical algorithms. In particular, optimization on manifolds is well-suited to deal with rank and orthogonality constraints. Such structured constraints appear pervasively in machine learning applications, including low-rank matrix completion, sensor network localization, camera network registration, independent component analysis, metric learning, dimensionality reduction and so on. The Manopt toolbox, available at www.manopt.org, is a user-friendly, documented piece of software dedicated to simplify experimenting with state of the art Riemannian optimization algorithms. We aim particularly at reaching practitioners outside our field.


Group descent algorithms for nonconvex penalized linear and logistic regression models with grouped predictors

arXiv.org Machine Learning

Penalized regression is an attractive framework for variable selection problems. Often, variables possess a grouping structure, and the relevant selection problem is that of selecting groups, not individual variables. The group lasso has been proposed as a way of extending the ideas of the lasso to the problem of group selection. Nonconvex penalties such as SCAD and MCP have been proposed and shown to have several advantages over the lasso; these penalties may also be extended to the group selection problem, giving rise to group SCAD and group MCP methods. Here, we describe algorithms for fitting these models stably and efficiently.


Measuring the Directional Distance Between Fuzzy Sets

arXiv.org Artificial Intelligence

The measure of distance between two fuzzy sets is a fundamental tool within fuzzy set theory. However, current distance measures within the literature do not account for the direction of change between fuzzy sets; a useful concept in a variety of applications, such as Computing With Words. In this paper, we highlight this utility and introduce a distance measure which takes the direction between sets into account. We provide details of its application for normal and non-normal, as well as convex and non-convex fuzzy sets. We demonstrate the new distance measure using real data from the MovieLens dataset and establish the benefits of measuring the direction between fuzzy sets.


Geodesic-based Salient Object Detection

arXiv.org Artificial Intelligence

Saliency detection has been an intuitive way to provide useful cues for object detection and segmentation, as desired for many vision and graphics applications. In this paper, we provided a robust method for salient object detection and segmentation. Other than using various pixel-level contrast definitions, we exploited global image structures and proposed a new geodesic method dedicated for salient object detection. In the proposed approach, a new geodesic scheme, namely geodesic tunneling is proposed to tackle with textures and local chaotic structures. With our new geodesic approach, a geodesic saliency map is estimated in correspondence to spatial structures in an image. Experimental evaluation on a salient object benchmark dataset validated that our algorithm consistently outperformed a number of the state-of-art saliency methods, yielding higher precision and better recall rates. With the robust saliency estimation, we also present an unsupervised hierarchical salient object cut scheme simply using adaptive saliency thresholding, which attained the highest score in our F-measure test. We also applied our geodesic cut scheme to a number of image editing tasks as demonstrated in additional experiments.


The Fractal Dimension of SAT Formulas

arXiv.org Artificial Intelligence

Modern SAT solvers have experienced a remarkable progress on solving industrial instances. Most of the techniques have been developed after an intensive experimental testing process. Recently, there have been some attempts to analyze the structure of these formulas in terms of complex networks, with the long-term aim of explaining the success of these SAT solving techniques, and possibly improving them. We study the fractal dimension of SAT formulas, and show that most industrial families of formulas are self-similar, with a small fractal dimension. We also show that this dimension is not affected by the addition of learnt clauses. We explore how the dimension of a formula, together with other graph properties can be used to characterize SAT instances. Finally, we give empirical evidence that these graph properties can be used in state-of-the-art portfolios.


Likelihood Adaptively Modified Penalties

arXiv.org Machine Learning

A new family of penalty functions, adaptive to likelihood, is introduced for model selection in general regression models. It arises naturally through assuming certain types of prior distribution on the regression parameters. To study stability properties of the penalized maximum likelihood estimator, two types of asymptotic stability are defined. Theoretical properties, including the parameter estimation consistency, model selection consistency, and asymptotic stability, are established under suitable regularity conditions. An efficient coordinate-descent algorithm is proposed. Simulation results and real data analysis show that the proposed method has competitive performance in comparison with existing ones.


Top-down particle filtering for Bayesian decision trees

arXiv.org Machine Learning

Decision tree learning is a popular approach for classification and regression in machine learning and statistics, and Bayesian formulations---which introduce a prior distribution over decision trees, and formulate learning as posterior inference given data---have been shown to produce competitive performance. Unlike classic decision tree learning algorithms like ID3, C4.5 and CART, which work in a top-down manner, existing Bayesian algorithms produce an approximation to the posterior distribution by evolving a complete tree (or collection thereof) iteratively via local Monte Carlo modifications to the structure of the tree, e.g., using Markov chain Monte Carlo (MCMC). We present a sequential Monte Carlo (SMC) algorithm that instead works in a top-down manner, mimicking the behavior and speed of classic algorithms. We demonstrate empirically that our approach delivers accuracy comparable to the most popular MCMC method, but operates more than an order of magnitude faster, and thus represents a better computation-accuracy tradeoff.


Validation of Soft Classification Models using Partial Class Memberships: An Extended Concept of Sensitivity & Co. applied to the Grading of Astrocytoma Tissues

arXiv.org Machine Learning

We use partial class memberships in soft classification to model uncertain labelling and mixtures of classes. Partial class memberships are not restricted to predictions, but may also occur in reference labels (ground truth, gold standard diagnosis) for training and validation data. Classifier performance is usually expressed as fractions of the confusion matrix, such as sensitivity, specificity, negative and positive predictive values. We extend this concept to soft classification and discuss the bias and variance properties of the extended performance measures. Ambiguity in reference labels translates to differences between best-case, expected and worst-case performance. We show a second set of measures comparing expected and ideal performance which is closely related to regression performance, namely the root mean squared error RMSE and the mean absolute error MAE. All calculations apply to classical crisp classification as well as to soft classification (partial class memberships and/or one-class classifiers). The proposed performance measures allow to test classifiers with actual borderline cases. In addition, hardening of e.g. posterior probabilities into class labels is not necessary, avoiding the corresponding information loss and increase in variance. We implement the proposed performance measures in the R package "softclassval", which is available from CRAN and at http://softclassval.r-forge.r-project.org. Our reasoning as well as the importance of partial memberships for chemometric classification is illustrated by a real-word application: astrocytoma brain tumor tissue grading (80 patients, 37000 spectra) for finding surgical excision borders. As borderline cases are the actual target of the analytical technique, samples which are diagnosed to be borderline cases must be included in the validation.