AITopics

1302.3407

Country:

North America > United States (0.14)
Europe (0.14)
Asia > Middle East > Israel (0.14)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.34)

Narayanan, Ajit, Chen, Yi

Bio-inspired data mining: Treating malware signatures as biosequences

arXiv.org Machine LearningFeb-14-2013

The application of machine learning to bioinformatics problems is well established. Less well understood is the application of bioinformatics techniques to machine learning and, in particular, the representation of non-biological data as biosequences. The aim of this paper is to explore the effects of giving amino acid representation to problematic machine learning data and to evaluate the benefits of supplementing traditional machine learning with bioinformatics tools and techniques. The signatures of 60 computer viruses and 60 computer worms were converted into amino acid representations and first multiply aligned separately to identify conserved regions across different families within each class (virus and worm). This was followed by a second alignment of all 120 aligned signatures together so that non-conserved regions were identified prior to input to a number of machine learning techniques. Differences in length between virus and worm signatures after the first alignment were resolved by the second alignment. Our first set of experiments indicates that representing computer malware signatures as amino acid sequences followed by alignment leads to greater classification and prediction accuracy. Our second set of experiments indicates that checking the results of data mining from artificial virus and worm data against known proteins can lead to generalizations being made from the domain of naturally occurring proteins to malware signatures. However, further work is needed to determine the advantages and disadvantages of different representations and sequence alignment methods for handling problematic machine learning data.

decision tree learning, immunology, signature, (24 more...)

1302.3668

Country:

Oceania > New Zealand (0.28)
Asia > Japan (0.28)
North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report > Experimental Study (0.47)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.68)
Health & Medicine > Therapeutic Area > Immunology (0.46)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Biomedical Informatics (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.68)
(3 more...)

Bunea, Florentina, She, Yiyuan, Wegkamp, Marten H.

Joint variable and rank selection for parsimonious estimation of high-dimensional matrices

arXiv.org Machine LearningFeb-13-2013

We propose dimension reduction methods for sparse, high-dimensional multivariate response regression models. Both the number of responses and that of the predictors may exceed the sample size. Sometimes viewed as complementary, predictor selection and rank reduction are the most popular strategies for obtaining lower-dimensional approximations of the parameter matrix in such models. We show in this article that important gains in prediction accuracy can be obtained by considering them jointly. We motivate a new class of sparse multivariate regression models, in which the coefficient matrix has low rank and zero rows or can be well approximated by such a matrix. Next, we introduce estimators that are based on penalized least squares, with novel penalties that impose simultaneous row and rank restrictions on the coefficient matrix. We prove that these estimators indeed adapt to the unknown matrix sparsity and have fast rates of convergence. We support our theoretical results with an extensive simulation study and two data analyses.

estimator, neurology, survey article, (20 more...)

doi: 10.1214/12-AOS1039

1110.3556

Country: North America > United States > Florida (0.28)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.46)
Health & Medicine > Therapeutic Area > Immunology (0.46)
Health & Medicine > Therapeutic Area > Neurology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Laskey, Kathryn Blackmond, Martignon, Laura

Bayesian Learning of Loglinear Models for Neural Connectivity

arXiv.org Machine LearningFeb-13-2013

This paper presents a Bayesian approach to learning the connectivity structure of a group of neurons from data on configuration frequencies. A major objective of the research is to provide statistical tools for detecting changes in firing patterns with changing stimuli. Our framework is not restricted to the well-understood case of pair interactions, but generalizes the Boltzmann machine model to allow for higher order interactions. The paper applies a Markov Chain Monte Carlo Model Composition (MC3) algorithm to search over connectivity structures and uses Laplace's method to approximate posterior probabilities of structures. Performance of the methods was tested on synthetic data. The models were also applied to data obtained by Vaadia on multi-unit recordings of several neurons in the visual cortex of a rhesus monkey in two different attentional states. Results confirmed the experimenters' conjecture that different attentional states were associated with different interaction structures.

bayesian inference, interaction, neurology, (20 more...)

1302.359

Country:

North America > United States > New York (0.14)
Asia > Middle East > Israel (0.14)
Europe > United Kingdom > England (0.14)

Genre: Research Report (0.50)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Friedman, Nir, Yakhini, Zohar

On the Sample Complexity of Learning Bayesian Networks

arXiv.org Machine LearningFeb-13-2013

In recent years there has been an increasing interest in learning Bayesian networks from data. One of the most effective methods for learning such networks is based on the minimum description length (MDL) principle. Previous work has shown that this learning procedure is asymptotically successful: with probability one, it will converge to the target distribution, given a sufficient number of samples. However, the rate of this convergence has been hitherto unknown. In this work we examine the sample complexity of MDL based learning procedures for Bayesian networks. We show that the number of samples needed to learn an epsilon-close approximation (in terms of entropy distance) with confidence delta is O((1/epsilon)^(4/3)log(1/epsilon)log(1/delta)loglog (1/delta)). This means that the sample complexity is a low-order polynomial in the error threshold and sub-linear in the confidence bound. We also discuss how the constants in this term depend on the complexity of the target distribution. Finally, we address questions of asymptotic minimality and propose a method for using the sample complexity results to speed up the learning process.

artificial intelligence, bayesian inference, machine learning, (15 more...)

1302.3579

Country: North America > United States > California > Santa Clara County (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Castillo, Enrique F., Solares, Cristina, Gomez, Patricia

Tail Sensitivity Analysis in Bayesian Networks

The paper presents an efficient method for simulating the tails of a target variable Z=h(X) which depends on a set of basic variables X=(X_1, ..., X_n). To this aim, variables X_i, i=1, ..., n are sequentially simulated in such a manner that Z=h(x_1, ..., x_i-1, X_i, ..., X_n) is guaranteed to be in the tail of Z. When this method is difficult to apply, an alternative method is proposed, which leads to a low rejection proportion of sample values, when compared with the Monte Carlo method. Both methods are shown to be very useful to perform a sensitivity analysis of Bayesian networks, when very large confidence intervals for the marginal/conditional probabilities are required, as in reliability or risk analysis. The methods are shown to behave best when all scores coincide. The required modifications for this to occur are discussed. The methods are illustrated with several examples and one example of application to a real case is used to illustrate the whole process.

artificial intelligence, bayesian inference, rejection proportion, (16 more...)

1302.3564

Country: Europe > Spain > Cantabria (0.15)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Benferhat, Salem, Dubois, Didier, Prade, Henri

Coping with the Limitations of Rational Inference in the Framework of Possibility Theory

Possibility theory offers a framework where both Lehmann's "preferential inference" and the more productive (but less cautious) "rational closure inference" can be represented. However, there are situations where the second inference does not provide expected results either because it cannot produce them, or even provide counter-intuitive conclusions. This state of facts is not due to the principle of selecting a unique ordering of interpretations (which can be encoded by one possibility distribution), but rather to the absence of constraints expressing pieces of knowledge we have implicitly in mind. It is advocated in this paper that constraints induced by independence information can help finding the right ordering of interpretations. In particular, independence constraints can be systematically assumed with respect to formulas composed of literals which do not appear in the conditional knowledge base, or for default rules with respect to situations which are "normal" according to the other default rules in the base. The notion of independence which is used can be easily expressed in the qualitative setting of possibility theory. Moreover, when a counter-intuitive plausible conclusion of a set of defaults, is in its rational closure, but not in its preferential closure, it is always possible to repair the set of defaults so as to produce the desired conclusion.

artificial intelligence, nonmonotonic reasoning, possibility distribution, (16 more...)

1302.3559

Country:

North America > United States (0.14)
Europe > France (0.14)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Nonmonotonic Logic (0.93)

Dubois, Didier, Prade, Henri

Belief Revision with Uncertain Inputs in the Possibilistic Setting

This paper discusses belief revision under uncertain inputs in the framework of possibility theory. Revision can be based on two possible definitions of the conditioning operation, one based on min operator which requires a purely ordinal scale only, and another based on product, for which a richer structure is needed, and which is a particular case of Dempster's rule of conditioning. Besides, revision under uncertain inputs can be understood in two different ways depending on whether the input is viewed, or not, as a constraint to enforce. Moreover, it is shown that M.A. Williams' transmutations, originally defined in the setting of Spohn's functions, can be captured in this framework, as well as Boutilier's natural revision.

artificial intelligence, belief revision, possibility distribution, (13 more...)

1302.3575

Country:

Europe > France (0.14)
North America > United States (0.14)
North America > Canada (0.14)
(3 more...)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Belief Revision (1.00)

Topological Parameters for Time-Space Tradeoff

Dechter, Rina

In this paper we propose a family of algorithms combining tree-clustering with conditioning that trade space for time. Such algorithms are useful for reasoning in probabilistic and deterministic networks as well as for accomplishing optimization tasks. By analyzing the problem structure it will be possible to select from a spectrum the algorithm that best meets a given time-space specification.

artificial intelligence, constraint-based reasoning, graph, (18 more...)

1302.3573

Country: North America > United States > California (0.28)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.32)

Propagation of 2-Monotone Lower Probabilities on an Undirected Graph

Chrisman, Lonnie

Lower and upper probabilities, also known as Choquet capacities, are widely used as a convenient representation for sets of probability distributions. This paper presents a graphical decomposition and exact propagation algorithm for computing marginal posteriors of 2-monotone lower probabilities (equivalently, 2-alternating upper probabilities).

artificial intelligence, bayesian inference, probability, (18 more...)

1302.3569

Country: North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.47)