AITopics

1104.158

Country:

North America > United States > California (0.28)
North America > United States > New York > New York County > New York City (0.15)
North America > Mexico (0.14)
(8 more...)

Genre: Research Report (1.00)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Government > Space Agency (0.69)
Health & Medicine (0.67)
Energy > Oil & Gas > Upstream (0.67)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.67)
(3 more...)

Henao, Ricardo, Winther, Ole

Sparse Linear Identifiable Multivariate Modeling

arXiv.org Machine LearningJun-23-2011

In this paper we consider sparse and identifiable linear latent variable (factor) and linear Bayesian network models for parsimonious analysis of multivariate data. We propose a computationally efficient method for joint parameter and model inference, and model comparison. It consists of a fully Bayesian hierarchy for sparse models using slab and spike priors (two-component delta-function and continuous mixtures), non-Gaussian latent factors and a stochastic search over the ordering of the variables. The framework, which we call SLIM (Sparse Linear Identifiable Multivariate modeling), is validated and bench-marked on artificial and real biological data sets. SLIM is closest in spirit to LiNGAM (Shimizu et al., 2006), but differs substantially in inference, Bayesian network structure learning and model comparison. Experimentally, SLIM performs equally well or better than LiNGAM with comparable computational complexity. We attribute this mainly to the stochastic search strategy used, and to parsimony (sparsity and identifiability), which is an explicit part of the model. We propose two extensions to the basic i.i.d. linear framework: non-linear dependence on observed variables, called SNIM (Sparse Non-linear Identifiable Multivariate modeling) and allowing for correlations between latent variables, called CSLIM (Correlated SLIM), for the temporal and/or spatial data. The source code and scripts are available from http://cogsys.imm.dtu.dk/slim/.

artificial intelligence, factor model, machine learning, (19 more...)

1004.5265

Country: Europe > Denmark (0.28)

Genre: Research Report > New Finding (0.67)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

arXiv.org Artificial IntelligenceJun-23-2011

Exploiting Contextual Independence In Probabilistic Inference

Poole, D., Zhang, N. L.

Bayesian belief networks have grown to prominence because they provide compact representations for many problems for which probabilistic inference is appropriate, and there are algorithms to exploit this compactness. The next step is to allow compact representations of the conditional probabilities of a variable given its parents. In this paper we present such a representation that exploits contextual independence in terms of parent contexts; which variables act as parents may depend on the value of other variables. The internal representation is in terms of contextual factors (confactors) that is simply a pair of a context and a table. The algorithm, contextual variable elimination, is based on the standard variable elimination algorithm that eliminates the non-query variables in turn, but when eliminating a variable, the tables that need to be multiplied can depend on the context. This algorithm reduces to standard variable elimination when there is no contextual independence structure to exploit. We show how this can be much more efficient than variable elimination when there is structure to exploit. We explain why this new method can exploit more structure than previous methods for structured belief network inference and an analogous algorithm that uses trees.

artificial intelligence, confactor, machine learning, (15 more...)

doi: 10.1613/jair.1122

1106.4864

Country:

North America > Canada (0.93)
North America > United States (0.67)

Genre: Research Report (0.64)

Industry: Water & Waste Management (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Jylänki, Pasi, Vanhatalo, Jarno, Vehtari, Aki

Gaussian Process Regression with a Student-t Likelihood

arXiv.org Machine LearningJun-22-2011

This paper considers the robust and efficient implementation of Gaussian process regression with a Student-t observation model. The challenge with the Student-t model is the analytically intractable inference which is why several approximative methods have been proposed. The expectation propagation (EP) has been found to be a very accurate method in many empirical studies but the convergence of the EP is known to be problematic with models containing non-log-concave site functions such as the Student-t distribution. In this paper we illustrate the situations where the standard EP fails to converge and review different modifications and alternative algorithms for improving the convergence. We demonstrate that convergence problems may occur during the type-II maximum a posteriori (MAP) estimation of the hyperparameters and show that the standard EP may not converge in the MAP values in some difficult cases. We present a robust implementation which relies primarily on parallel EP updates and utilizes a moment-matching-based double-loop algorithm with adaptively selected step size in difficult cases. The predictive performance of the EP is compared to the Laplace, variational Bayes, and Markov chain Monte Carlo approximations.

approximation, artificial intelligence, machine learning, (16 more...)

1106.4431

Country:

Europe (0.67)
North America > United States > California (0.28)

Genre: Research Report (0.82)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)

arXiv.org Machine LearningJun-22-2011

Machine Learning Markets

Storkey, Amos

Prediction markets show considerable promise for developing flexible mechanisms for machine learning. Here, machine learning markets for multivariate systems are defined, and a utility-based framework is established for their analysis. This differs from the usual approach of defining static betting functions. It is shown that such markets can implement model combination methods used in machine learning, such as product of expert and mixture of expert approaches as equilibrium pricing models, by varying agent utility functions. They can also implement models composed of local potentials, and message passing methods. Prediction markets also allow for more flexible combinations, by combining multiple different utility functions. Conversely, the market mechanisms implement inference in the relevant probabilistic models. This means that market mechanism can be utilized for implementing parallelized model building and inference for probabilistic modelling.

artificial intelligence, bayesian inference, machine learning, (17 more...)

1106.4509

Country: North America > United States (0.28)

Genre: Research Report (0.50)

Industry: Banking & Finance > Trading > Prediction Market (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Kalaitzis, Alfredo A., Lawrence, Neil D.

Residual Component Analysis

arXiv.org Artificial IntelligenceJun-21-2011

Probabilistic principal component analysis (PPCA) seeks a low dimensional representation of a data set in the presence of independent spherical Gaussian noise, Sigma = (sigma^2)*I. The maximum likelihood solution for the model is an eigenvalue problem on the sample covariance matrix. In this paper we consider the situation where the data variance is already partially explained by other factors, e.g. covariates of interest, or temporal correlations leaving some residual variance. We decompose the residual variance into its components through a generalized eigenvalue problem, which we call residual component analysis (RCA). We show that canonical covariates analysis (CCA) is a special case of our algorithm and explore a range of new algorithms that arise from the framework. We illustrate the ideas on a gene expression time series data set and the recovery of human pose from silhouette.

artificial intelligence, bayesian inference, machine learning, (19 more...)

1106.4333

Country: North America > United States (0.47)

Genre: Research Report (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.35)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.35)

Bellenger, Amandine, Gatepaille, Sylvain

Uncertainty in Ontologies: Dempster-Shafer Theory for Data Fusion Applications

arXiv.org Artificial IntelligenceJun-20-2011

Nowadays ontologies present a growing interest in Data Fusion applications. As a matter of fact, the ontologies are seen as a semantic tool for describing and reasoning about sensor data, objects, relations and general domain theories. In addition, uncertainty is perhaps one of the most important characteristics of the data and information handled by Data Fusion. However, the fundamental nature of ontologies implies that ontologies describe only asserted and veracious facts of the world. Different probabilistic, fuzzy and evidential approaches already exist to fill this gap; this paper recaps the most popular tools. However none of the tools meets exactly our purposes. Therefore, we constructed a Dempster-Shafer ontology that can be imported into any specific domain ontology and that enables us to instantiate it in an uncertain manner. We also developed a Java application that enables reasoning about these uncertain ontological instances.

artificial intelligence, bayesian inference, machine learning, (17 more...)

1106.3876

Country: Europe > Germany (0.28)

Genre: Research Report (0.40)

Industry: Government > Military (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)

arXiv.org Artificial IntelligenceJun-9-2011

Parameter Learning of Logic Programs for Symbolic-Statistical Modeling

Sato, T., Kameya, Y.

We propose a logical/mathematical framework for statistical parameter learning of parameterized logic programs, i.e. definite clause programs containing probabilistic facts with a parameterized distribution. It extends the traditional least Herbrand model semantics in logic programming to distribution semantics, possible world semantics with a probability distribution which is unconditionally applicable to arbitrary logic programs including ones for HMMs, PCFGs and Bayesian networks. We also propose a new EM algorithm, the graphical EM algorithm, that runs for a class of parameterized logic programs representing sequential decision processes where each decision is exclusive and independent. It runs on a new data structure called support graphs describing the logical relationship between observations and their explanations, and learns parameters by computing inside and outside probability generalized for logic programs. The complexity analysis shows that when combined with OLDT search for all explanations for observations, the graphical EM algorithm, despite its generality, has the same time complexity as existing EM algorithms, i.e. the Baum-Welch algorithm for HMMs, the Inside-Outside algorithm for PCFGs, and the one for singly connected Bayesian networks that have been developed independently in each research field. Learning experiments with PCFGs using two corpora of moderate size indicate that the graphical EM algorithm can significantly outperform the Inside-Outside algorithm.

logic & formal reasoning, machine learning, natural language, (19 more...)

doi: 10.1613/jair.912

1106.1797

Country: Asia > Japan (0.28)

Genre:

Research Report (0.49)
Instructional Material (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
(3 more...)

arXiv.org Artificial IntelligenceJun-3-2011

Accelerating Reinforcement Learning through Implicit Imitation

Boutilier, C., Price, B.

Imitation can be viewed as a means of enhancing learning in multiagent environments. It augments an agent's ability to learn useful behaviors by making intelligent use of the knowledge implicit in behaviors demonstrated by cooperative teachers or other more experienced agents. We propose and study a formal model of implicit imitation that can accelerate reinforcement learning dramatically in certain cases. Roughly, by observing a mentor, a reinforcement-learning agent can extract information about its own capabilities in, and the relative value of, unvisited parts of the state space. We study two specific instantiations of this model, one in which the learning agent and the mentor have identical abilities, and one designed to deal with agents and mentors with different action sets. We illustrate the benefits of implicit imitation by integrating it with prioritized sweeping, and demonstrating improved performance and convergence through observation of single and multiple mentors. Though we make some stringent assumptions regarding observability and possible interactions, we briefly comment on extensions of the model that relax these restricitions.

agent, mentor, observer, (16 more...)

doi: 10.1613/jair.898

1106.0681

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > New York (0.04)
(17 more...)

Genre: Research Report (0.81)

Industry:

Education (0.46)
Energy (0.46)
Leisure & Entertainment (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
(2 more...)

Cheng, J., Druzdzel, M. J.

AIS-BN: An Adaptive Importance Sampling Algorithm for Evidential Reasoning in Large Bayesian Networks

arXiv.org Artificial IntelligenceJun-1-2011

Stochastic sampling algorithms, while an attractive alternative to exact algorithms in very large Bayesian network models, have been observed to perform poorly in evidential reasoning with extremely unlikely evidence. To address this problem, we propose an adaptive importance sampling algorithm, AIS-BN, that shows promising convergence rates even under extreme conditions and seems to outperform the existing sampling algorithms consistently. Three sources of this performance improvement are (1) two heuristics for initialization of the importance function that are based on the theoretical properties of importance sampling in finite-dimensional integrals and the structural advantages of Bayesian networks, (2) a smooth learning method for the importance function, and (3) a dynamic weighting function for combining samples from different stages of the algorithm. We tested the performance of the AIS-BN algorithm along with two state of the art general purpose sampling algorithms, likelihood weighting (Fung and Chang, 1989; Shachter and Peot, 1989) and self-importance sampling (Shachter and Peot, 1989). We used in our tests three large real Bayesian network models available to the scientific community: the CPCS network (Pradhan et al., 1994), the PathFinder network (Heckerman, Horvitz, and Nathwani, 1990), and the ANDES network (Conati, Gertner, VanLehn, and Druzdzel, 1997), with evidence as unlikely as 10^-41. While the AIS-BN algorithm always performed better than the other two algorithms, in the majority of the test cases it achieved orders of magnitude improvement in precision of the results. Improvement in speed given a desired precision is even more dramatic, although we are unable to report numerical results here, as the other algorithms almost never achieved the precision reached even by the first few iterations of the AIS-BN algorithm.

adaptive importance sampling algorithm, artificial intelligence, machine learning, (6 more...)

doi: 10.1613/jair.764

1106.0253

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)