Goto

Collaborating Authors

 Country


A consistent clustering-based approach to estimating the number of change-points in highly dependent time-series

arXiv.org Machine Learning

The problem of change-point estimation is considered under a general framework where the data are generated by unknown stationary ergodic process distributions. In this context, the consistent estimation of the number of change-points is provably impossible. However, it is shown that a consistent clustering method may be used to estimate the number of change points, under the additional constraint that the correct number of process distributions that generate the data is provided. This additional parameter has a natural interpretation in many real-world applications. An algorithm is proposed that estimates the number of change-points and locates the changes. The proposed algorithm is shown to be asymptotically consistent; its empirical evaluations are provided.


Bio-inspired data mining: Treating malware signatures as biosequences

arXiv.org Machine Learning

The application of machine learning to bioinformatics problems is well established. Less well understood is the application of bioinformatics techniques to machine learning and, in particular, the representation of non-biological data as biosequences. The aim of this paper is to explore the effects of giving amino acid representation to problematic machine learning data and to evaluate the benefits of supplementing traditional machine learning with bioinformatics tools and techniques. The signatures of 60 computer viruses and 60 computer worms were converted into amino acid representations and first multiply aligned separately to identify conserved regions across different families within each class (virus and worm). This was followed by a second alignment of all 120 aligned signatures together so that non-conserved regions were identified prior to input to a number of machine learning techniques. Differences in length between virus and worm signatures after the first alignment were resolved by the second alignment. Our first set of experiments indicates that representing computer malware signatures as amino acid sequences followed by alignment leads to greater classification and prediction accuracy. Our second set of experiments indicates that checking the results of data mining from artificial virus and worm data against known proteins can lead to generalizations being made from the domain of naturally occurring proteins to malware signatures. However, further work is needed to determine the advantages and disadvantages of different representations and sequence alignment methods for handling problematic machine learning data.


Joint variable and rank selection for parsimonious estimation of high-dimensional matrices

arXiv.org Machine Learning

We propose dimension reduction methods for sparse, high-dimensional multivariate response regression models. Both the number of responses and that of the predictors may exceed the sample size. Sometimes viewed as complementary, predictor selection and rank reduction are the most popular strategies for obtaining lower-dimensional approximations of the parameter matrix in such models. We show in this article that important gains in prediction accuracy can be obtained by considering them jointly. We motivate a new class of sparse multivariate regression models, in which the coefficient matrix has low rank and zero rows or can be well approximated by such a matrix. Next, we introduce estimators that are based on penalized least squares, with novel penalties that impose simultaneous row and rank restrictions on the coefficient matrix. We prove that these estimators indeed adapt to the unknown matrix sparsity and have fast rates of convergence. We support our theoretical results with an extensive simulation study and two data analyses.


Bayesian Learning of Loglinear Models for Neural Connectivity

arXiv.org Machine Learning

This paper presents a Bayesian approach to learning the connectivity structure of a group of neurons from data on configuration frequencies. A major objective of the research is to provide statistical tools for detecting changes in firing patterns with changing stimuli. Our framework is not restricted to the well-understood case of pair interactions, but generalizes the Boltzmann machine model to allow for higher order interactions. The paper applies a Markov Chain Monte Carlo Model Composition (MC3) algorithm to search over connectivity structures and uses Laplace's method to approximate posterior probabilities of structures. Performance of the methods was tested on synthetic data. The models were also applied to data obtained by Vaadia on multi-unit recordings of several neurons in the visual cortex of a rhesus monkey in two different attentional states. Results confirmed the experimenters' conjecture that different attentional states were associated with different interaction structures.


On the Sample Complexity of Learning Bayesian Networks

arXiv.org Machine Learning

In recent years there has been an increasing interest in learning Bayesian networks from data. One of the most effective methods for learning such networks is based on the minimum description length (MDL) principle. Previous work has shown that this learning procedure is asymptotically successful: with probability one, it will converge to the target distribution, given a sufficient number of samples. However, the rate of this convergence has been hitherto unknown. In this work we examine the sample complexity of MDL based learning procedures for Bayesian networks. We show that the number of samples needed to learn an epsilon-close approximation (in terms of entropy distance) with confidence delta is O((1/epsilon)^(4/3)log(1/epsilon)log(1/delta)loglog (1/delta)). This means that the sample complexity is a low-order polynomial in the error threshold and sub-linear in the confidence bound. We also discuss how the constants in this term depend on the complexity of the target distribution. Finally, we address questions of asymptotic minimality and propose a method for using the sample complexity results to speed up the learning process.


Tail Sensitivity Analysis in Bayesian Networks

arXiv.org Artificial Intelligence

The paper presents an efficient method for simulating the tails of a target variable Z=h(X) which depends on a set of basic variables X=(X_1, ..., X_n). To this aim, variables X_i, i=1, ..., n are sequentially simulated in such a manner that Z=h(x_1, ..., x_i-1, X_i, ..., X_n) is guaranteed to be in the tail of Z. When this method is difficult to apply, an alternative method is proposed, which leads to a low rejection proportion of sample values, when compared with the Monte Carlo method. Both methods are shown to be very useful to perform a sensitivity analysis of Bayesian networks, when very large confidence intervals for the marginal/conditional probabilities are required, as in reliability or risk analysis. The methods are shown to behave best when all scores coincide. The required modifications for this to occur are discussed. The methods are illustrated with several examples and one example of application to a real case is used to illustrate the whole process.


Coping with the Limitations of Rational Inference in the Framework of Possibility Theory

arXiv.org Artificial Intelligence

Possibility theory offers a framework where both Lehmann's "preferential inference" and the more productive (but less cautious) "rational closure inference" can be represented. However, there are situations where the second inference does not provide expected results either because it cannot produce them, or even provide counter-intuitive conclusions. This state of facts is not due to the principle of selecting a unique ordering of interpretations (which can be encoded by one possibility distribution), but rather to the absence of constraints expressing pieces of knowledge we have implicitly in mind. It is advocated in this paper that constraints induced by independence information can help finding the right ordering of interpretations. In particular, independence constraints can be systematically assumed with respect to formulas composed of literals which do not appear in the conditional knowledge base, or for default rules with respect to situations which are "normal" according to the other default rules in the base. The notion of independence which is used can be easily expressed in the qualitative setting of possibility theory. Moreover, when a counter-intuitive plausible conclusion of a set of defaults, is in its rational closure, but not in its preferential closure, it is always possible to repair the set of defaults so as to produce the desired conclusion.


Belief Revision with Uncertain Inputs in the Possibilistic Setting

arXiv.org Artificial Intelligence

This paper discusses belief revision under uncertain inputs in the framework of possibility theory. Revision can be based on two possible definitions of the conditioning operation, one based on min operator which requires a purely ordinal scale only, and another based on product, for which a richer structure is needed, and which is a particular case of Dempster's rule of conditioning. Besides, revision under uncertain inputs can be understood in two different ways depending on whether the input is viewed, or not, as a constraint to enforce. Moreover, it is shown that M.A. Williams' transmutations, originally defined in the setting of Spohn's functions, can be captured in this framework, as well as Boutilier's natural revision.


Topological Parameters for Time-Space Tradeoff

arXiv.org Artificial Intelligence

In this paper we propose a family of algorithms combining tree-clustering with conditioning that trade space for time. Such algorithms are useful for reasoning in probabilistic and deterministic networks as well as for accomplishing optimization tasks. By analyzing the problem structure it will be possible to select from a spectrum the algorithm that best meets a given time-space specification.


Propagation of 2-Monotone Lower Probabilities on an Undirected Graph

arXiv.org Artificial Intelligence

Lower and upper probabilities, also known as Choquet capacities, are widely used as a convenient representation for sets of probability distributions. This paper presents a graphical decomposition and exact propagation algorithm for computing marginal posteriors of 2-monotone lower probabilities (equivalently, 2-alternating upper probabilities).