Concept Modeling with Superwords

arXiv.org Machine Learning

In information retrieval, a fundamental goal is to transform a document into concepts that are representative of its content. The term "representative" is in itself challenging to define, and various tasks require different granularities of concepts. In this paper, we aim to model concepts that are sparse over the vocabulary, and that flexibly adapt their content based on other relevant semantic information such as textual structure or associated image features. We explore a Bayesian nonparametric model based on nested beta processes that allows for inferring an unknown number of strictly sparse concepts. The resulting model provides an inherently different representation of concepts than a standard LDA (or HDP) based topic model, and allows for direct incorporation of semantic features. We demonstrate the utility of this representation on multilingual blog data and the Congressional Record.


Sparse and Unique Nonnegative Matrix Factorization Through Data Preprocessing

arXiv.org Machine Learning

Nonnegative matrix factorization (NMF) has become a very popular technique in machine learning because it automatically extracts meaningful features through a sparse and part-based representation. However, NMF has the drawback of being highly ill-posed, that is, there typically exist many different but equivalent factorizations. In this paper, we introduce a completely new way of obtaining more well-posed NMF problems whose solutions are sparser. Our technique is based on the preprocessing of the nonnegative input data matrix, and relies on the theory of M-matrices and the geometric interpretation of NMF. This approach provably leads to optimal and sparse solutions under the separability assumption of Donoho and Stodden (NIPS, 2003), and, for rank-three matrices, makes the number of exact factorizations finite. We illustrate the effectiveness of our technique on several image datasets.
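
The ill-posedness mentioned above can be seen on a toy example. The sketch below is only an illustration, not the preprocessing technique of the paper: the matrices `W`, `H`, and `Q` are hypothetical values chosen so that two genuinely different nonnegative factor pairs reproduce the same data matrix.

```python
import numpy as np

# Hypothetical nonnegative factors, chosen only for illustration.
W = np.array([[1.0, 2.0],
              [3.0, 1.0],
              [2.0, 2.0]])
H = np.array([[1.0, 0.0, 2.0],
              [1.0, 3.0, 1.0]])
X = W @ H  # the nonnegative data matrix

# An invertible matrix Q picked so that both W @ Q and inv(Q) @ H
# remain nonnegative; it is neither a permutation nor a scaling.
Q = np.array([[1.0, 0.0],
              [0.5, 1.0]])

W2 = W @ Q
H2 = np.linalg.inv(Q) @ H

# (W2, H2) is a second exact nonnegative factorization of the same X.
print(np.allclose(W2 @ H2, X))
print(W2)
print(H2)
```

Note that `H2` contains more zeros than `H` here, which hints at why asking for sparser factors can help narrow down the set of equivalent solutions.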


Detecting lateral genetic material transfer

arXiv.org Artificial Intelligence

Bioinformatics methods for detecting lateral gene transfer events are mainly based on characteristics of functional coding DNA. In this paper, we propose the use of DNA traits that do not depend on protein coding requirements. We introduce several semilocal variables that depend on the DNA primary sequence, reflect thermodynamic as well as physico-chemical quantities, and can tell apart the genomes of different organisms. After combining these variables in a neural classifier, we obtain results whose resolving power goes as far as detecting the exchange of genomic material between bacteria that are phylogenetically close.


Evolutionary Computation in Astronomy and Astrophysics: A Review

arXiv.org Artificial Intelligence

In general, Evolutionary Computation (EC) includes a number of optimization methods inspired by biological mechanisms of evolution. The methods catalogued in this area use the Darwinian principles of evolution to produce algorithms that return high-quality solutions to hard-to-solve optimization problems. The main strength of EC is precisely that it provides good solutions even if the computational resources (e.g., running time) are limited. Astronomy and astrophysics are two fields that often require optimizing problems of high complexity or analyzing a huge amount of data, and the so-called complete optimization methods are inherently limited by the size of the problem or data. For instance, reliable analysis of large amounts of data is central to modern astrophysics and astronomical sciences in general. EC techniques perform well where other optimization methods are inherently limited (such as complete methods applied to NP-hard problems), and in the last ten years numerous proposals have appeared that apply, with greater or lesser success, evolutionary computation methodologies to common engineering problems. Some of these problems, such as the estimation of nonlinear parameters, the development of automatic learning techniques, the implementation of control systems, or the resolution of multi-objective optimization problems, have had (and still have) a special impact in these fields. For these reasons, EC emerges as a feasible alternative to traditional methods. In this paper, we discuss some promising applications in this direction and a number of recent works in this area; the paper also includes a general description of EC to provide a global perspective to the reader and gives some guidelines for applying EC techniques in future research.
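
As a rough illustration of the kind of method EC covers, here is a minimal sketch of an elitist evolutionary loop (Gaussian mutation plus survivor selection) under a fixed generation budget. The function names, parameter defaults, and the `sphere` test objective are assumptions made for this example and are not taken from the review.

```python
import random

def evolve(fitness, dim, pop_size=20, generations=50, sigma=0.3, seed=0):
    """Minimise `fitness` with a minimal (mu + lambda) evolutionary loop."""
    rng = random.Random(seed)
    # Random initial population in the box [-5, 5]^dim.
    population = [[rng.uniform(-5.0, 5.0) for _ in range(dim)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Variation: every parent yields one child by Gaussian mutation.
        children = [[x + rng.gauss(0.0, sigma) for x in parent]
                    for parent in population]
        # Selection: keep the best pop_size among parents and children,
        # so the best individual found so far is never lost.
        population = sorted(population + children, key=fitness)[:pop_size]
    return min(population, key=fitness)

def sphere(x):
    """Toy objective with its minimum at the origin."""
    return sum(v * v for v in x)

best = evolve(sphere, dim=3)
print(best, sphere(best))
```

The `generations` argument plays the role of the limited computational budget: the loop can be stopped at any point and still returns the best candidate seen so far.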


Robust Spatio-Temporal Signal Recovery from Noisy Counts in Social Media

arXiv.org Artificial Intelligence

Many real-world phenomena can be represented by a spatio-temporal signal: where, when, and how much. Social media is a tantalizing data source for those who wish to monitor such signals. Unlike most prior work, we assume that the target phenomenon is known and we are given a method to count its occurrences in social media. However, counting is plagued by sample bias, incomplete data, and, paradoxically, data scarcity -- issues inadequately addressed by prior work. We formulate signal recovery as a Poisson point process estimation problem. We explicitly incorporate human population bias, time delays and spatial distortions, and spatio-temporal regularization into the model to address the noisy count issues. We present an efficient optimization algorithm and discuss its theoretical properties. We show that our model is more accurate than commonly-used baselines. Finally, we present a case study on wildlife roadkill monitoring, where our model produces qualitatively convincing results.


Robust Nonnegative Matrix Factorization via $L_1$ Norm Regularization

arXiv.org Machine Learning

Nonnegative Matrix Factorization (NMF) is a widely used technique in many applications, such as face recognition and motion segmentation. It approximates the nonnegative data in an original high-dimensional space with a linear representation in a low-dimensional space by using the product of two nonnegative matrices. In many applications, data are partially corrupted with large additive noise. When the positions of the noise are known, some existing variants of NMF can be applied by treating the corrupted entries as missing values. However, the positions are often unknown in real-world applications, which prevents the use of traditional NMF or its existing variants. This paper proposes a Robust Nonnegative Matrix Factorization (RobustNMF) algorithm that explicitly models the partial corruption as large additive noise without requiring information about the positions of the noise. In practice, large additive noise can be used to model outliers. In particular, the proposed method jointly approximates the clean data matrix with the product of two nonnegative matrices and estimates the positions and values of the outliers/noise. An efficient iterative optimization algorithm with a solid theoretical justification is proposed to learn the desired matrix factorization. Experimental results demonstrate the advantages of the proposed algorithm.
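
The abstract does not state the objective function. One common way to write a model of this kind, given here only as an assumed sketch consistent with the title, splits the data matrix $X$ into a low-rank nonnegative part $WH$ and a sparse corruption term $S$ whose $L_1$ norm is penalised with a hypothetical weight $\lambda$:

```latex
\min_{W \ge 0,\; H \ge 0,\; S} \;
  \lVert X - WH - S \rVert_F^2 \;+\; \lambda \lVert S \rVert_1 ,
\qquad
\lVert S \rVert_1 = \sum_{i,j} \lvert S_{ij} \rvert
```

Under this reading, the nonzero entries of $S$ give both the positions and the values of the outliers, while $WH$ approximates the clean data.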


Coherence Functions with Applications in Large-Margin Classification Methods

arXiv.org Machine Learning

Support vector machines (SVMs) naturally embody sparseness due to their use of hinge loss functions. However, SVMs cannot directly estimate conditional class probabilities. In this paper we propose and study a family of coherence functions, which are convex and differentiable, as surrogates of the hinge function. The coherence function is derived by using the maximum-entropy principle and is characterized by a temperature parameter. It bridges the hinge function and the logit function in logistic regression. The limit of the coherence function at zero temperature corresponds to the hinge function, and the limit of the minimizer of its expected error is the minimizer of the expected error of the hinge loss. We refer to the use of the coherence function in large-margin classification as ${\cal C}$-learning, and we present efficient coordinate descent algorithms for the training of regularized ${\cal C}$-learning models.
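
The abstract does not give the formula of the coherence function. As an assumed stand-in with the properties it describes (convex, differentiable, indexed by a temperature, and tending to the hinge as the temperature goes to zero), the sketch below uses the softplus-type smoothing $\rho \log(1 + \exp((1 - z)/\rho))$. The names `smoothed_hinge` and `rho` are hypothetical, and the paper's actual definition may differ.

```python
import math

def hinge(z):
    """Standard hinge loss on the margin z = y * f(x)."""
    return max(0.0, 1.0 - z)

def smoothed_hinge(z, rho):
    """rho * log(1 + exp((1 - z) / rho)), evaluated in a stable way."""
    a = (1.0 - z) / rho
    # Identity used: log(1 + exp(a)) = max(a, 0) + log1p(exp(-|a|)).
    return rho * (max(a, 0.0) + math.log1p(math.exp(-abs(a))))

for z in (-2.0, 0.0, 0.5, 1.0, 3.0):
    print(z, hinge(z), smoothed_hinge(z, 1.0), smoothed_hinge(z, 0.01))
```

For this particular smoothing the gap to the hinge is at most `rho * log(2)`, so lowering the temperature tightens the approximation, while `rho = 1` gives a logistic-type loss shifted by the margin.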


Publishing Identifiable Experiment Code And Configuration Is Important, Good and Easy

arXiv.org Artificial Intelligence

A few months ago, a graduate student in another country called me (Vaughan) to ask for the source code of one of my multi-robot simulation experiments. The student had an idea for a modification that she thought would improve the system's performance. By the standards of scientific practice this was a perfectly reasonable request and I felt obliged to give it to her. With our original code, the student could (i) rerun our experiments to verify that we reported the results correctly; (ii) inspect the code to make sure that it actually implements the algorithm described in our paper; (iii) change parameters and initial conditions to make sure our results were not a fluke of the particular experimental setting; (iv) modify the robot controllers and quantitatively compare her new method with our originals. It would cost me nothing to make her a copy of our code, and her methodology would be impeccable. Why then do we read so few papers using this methodology? It turned out to be impossible to identify exactly which code was used to perform the experiments in our years-old paper. We had not labeled the source code at that moment, and it had subsequently been modified. All the code was under version control, so we could obtain approximately the right code by looking at revision dates.


Knapsack based Optimal Policies for Budget-Limited Multi-Armed Bandits

arXiv.org Artificial Intelligence

In budget-limited multi-armed bandit (MAB) problems, the learner's actions are costly and constrained by a fixed budget. Consequently, an optimal exploitation policy may not be to pull the optimal arm repeatedly, as is the case in other variants of MAB, but rather to pull the sequence of different arms that maximises the agent's total reward within the budget. This difference from existing MABs means that new approaches to maximising the total reward are required. Given this, we develop two pulling policies, namely: (i) KUBE; and (ii) fractional KUBE. Whereas the former performs up to 40% better in our experimental settings, the latter is computationally less expensive. We also prove logarithmic upper bounds for the regret of both policies, and show that these bounds are asymptotically optimal (i.e. they only differ from the best possible regret by a constant factor).
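
To make the budget-limited setting concrete, the following sketch pulls arms until no arm is affordable, choosing at each step the affordable arm with the highest upper-confidence estimate per unit cost. It is a simplified illustration and is not the authors' KUBE or fractional KUBE; the function `budgeted_ucb`, the arm costs, and the Bernoulli reward probabilities are all assumptions made for this example.

```python
import math
import random

def budgeted_ucb(costs, pull, budget):
    """Pull arms until the budget runs out, ranking arms by UCB per unit cost."""
    k = len(costs)
    counts = [0] * k      # number of pulls per arm
    means = [0.0] * k     # running mean reward per arm
    total_reward, t = 0.0, 0
    while True:
        affordable = [i for i in range(k) if costs[i] <= budget]
        if not affordable:
            break
        untried = [i for i in affordable if counts[i] == 0]
        if untried:
            arm = untried[0]  # try each affordable arm once first
        else:
            arm = max(
                affordable,
                key=lambda i: (means[i] + math.sqrt(2.0 * math.log(t) / counts[i]))
                / costs[i],
            )
        reward = pull(arm)
        t += 1
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]
        total_reward += reward
        budget -= costs[arm]
    return total_reward, counts

# Hypothetical two-armed instance: arm 0 pays more often but costs three times as much.
rng = random.Random(0)
probs = [0.9, 0.5]
costs = [3.0, 1.0]
total, counts = budgeted_ucb(costs, lambda i: 1.0 if rng.random() < probs[i] else 0.0, 100.0)
print(total, counts)
```

In this instance the arm with the higher mean reward is not the better choice per unit of budget, which is the point the abstract makes about exploitation under a budget.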


Estimation of causal orders in a linear non-Gaussian acyclic model: a method robust against latent confounders

arXiv.org Machine Learning

We consider learning a causal ordering of variables in a linear non-Gaussian acyclic model called LiNGAM. Several existing methods have been shown to consistently estimate a causal ordering assuming that all the model assumptions are correct. However, the estimation results could be distorted if some assumptions are actually violated. In this paper, we propose a new algorithm for learning causal orders that is robust against one typical violation of the model assumptions: latent confounders. We demonstrate the effectiveness of our method using artificial data.
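
For orientation, the LiNGAM model is usually written as follows, where $k(i)$ denotes the position of variable $x_i$ in the causal ordering and the disturbances $e_i$ are non-Gaussian and mutually independent. The second line is the usual way latent confounders $f_l$ are added; the notation ($b_{ij}$, $\lambda_{il}$, $p$, $q$) is assumed here and may differ from the paper's.

```latex
x_i = \sum_{k(j) < k(i)} b_{ij}\, x_j + e_i, \qquad i = 1, \dots, p

x_i = \sum_{k(j) < k(i)} b_{ij}\, x_j + \sum_{l=1}^{q} \lambda_{il}\, f_l + e_i
```

In the first form the ordering $k(\cdot)$ is what the estimation methods recover; in the second, the shared terms $f_l$ make the effective disturbances of different variables dependent, which is the violation the abstract refers to.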