AITopics

1202.6228

Country: Europe (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)

Massam, Helene, Wang, Nanwei

Distributed parameter estimation of discrete hierarchical models via marginal likelihoods

arXiv.org Machine LearningOct-21-2013

We consider discrete graphical models Markov with respect to a graph $G$ and propose two distributed marginal methods to estimate the maximum likelihood estimate of the canonical parameter of the model. Both methods are based on a relaxation of the marginal likelihood obtained by considering the density of the variables represented by a vertex $v$ of $G$ and a neighborhood. The two methods differ by the size of the neighborhood of $v$. We show that the estimates are consistent and that those obtained with the larger neighborhood have smaller asymptotic variance than the ones obtained through the smaller neighborhood.

artificial intelligence, machine learning, variance, (18 more...)

1310.5666

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.35)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.35)

Park, Il Memming, Seth, Sohan, Van Vaerenbergh, Steven

Bayesian Extensions of Kernel Least Mean Squares

arXiv.org Machine LearningOct-20-2013

The kernel least mean squares (KLMS) algorithm is a computationally efficient nonlinear adaptive filtering method that "kernelizes" the celebrated (linear) least mean squares algorithm. We demonstrate that the least mean squares algorithm is closely related to the Kalman filtering, and thus, the KLMS can be interpreted as an approximate Bayesian filtering method. This allows us to systematically develop extensions of the KLMS by modifying the underlying state-space and observation models. The resulting extensions introduce many desirable properties such as "forgetting", and the ability to learn from discrete data, while retaining the computational simplicity and time complexity of the original algorithm.

artificial intelligence, bayesian inference, machine learning, (16 more...)

1310.5347

Country:

North America > United States (0.68)
Europe (0.46)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Journal of Artificial Intelligence ResearchOct-16-2013

Learning Optimal Bayesian Networks: A Shortest Path Perspective

Yuan, C., Malone, B.

In this paper, learning a Bayesian network structure that optimizes a scoring function for a given dataset is viewed as a shortest path problem in an implicit state-space search graph. This perspective highlights the importance of two research issues: the development of search strategies for solving the shortest path problem, and the design of heuristic functions for guiding the search. This paper introduces several techniques for addressing the issues. One is an A* search algorithm that learns an optimal Bayesian network structure by only searching the most promising part of the solution space. The others are mainly two heuristic functions. The first heuristic function represents a simple relaxation of the acyclicity constraint of a Bayesian network. Although admissible and consistent, the heuristic may introduce too much relaxation and result in a loose bound. The second heuristic function reduces the amount of relaxation by avoiding directed cycles within some groups of variables. Empirical results show that these methods constitute a promising approach to learning optimal Bayesian network structures.

algorithm, bayesian network, pattern database, (15 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.4039

AI Access Foundation

10835

Journal of Artificial Intelligence Research

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Washington > King County > Seattle (0.14)
North America > United States > New York > Queens County > New York City (0.14)
(7 more...)

Genre:

Research Report > New Finding (0.48)
Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Gillberg, Jussi, Marttinen, Pekka, Pirinen, Matti, Kangas, Antti J, Soininen, Pasi, Järvelin, Marjo-Riitta, Ala-Korpela, Mika, Kaski, Samuel

Bayesian Information Sharing Between Noise And Regression Models Improves Prediction of Weak Effects

We consider the prediction of weak effects in a multiple-output regression setup, when covariates are expected to explain a small amount, less than $\approx 1%$, of the variance of the target variables. To facilitate the prediction of the weak effects, we constrain our model structure by introducing a novel Bayesian approach of sharing information between the regression model and the noise model. Further reduction of the effective number of parameters is achieved by introducing an infinite shrinkage prior and group sparsity in the context of the Bayesian reduced rank regression, and using the Bayesian infinite factor model as a flexible low-rank noise model. In our experiments the model incorporating the novelties outperformed alternatives in genomic prediction of rich phenotype data. In particular, the information sharing between the noise and regression models led to significant improvement in prediction accuracy.

artificial intelligence, machine learning, noise model, (16 more...)

1310.4362

Country:

Europe > Finland (0.72)
Europe > United Kingdom (0.46)

Genre:

Research Report > Experimental Study (0.68)
Research Report > New Finding (0.48)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.68)
Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Inference, Sampling, and Learning in Copula Cumulative Distribution Networks

Webb, Stefan Douglas

The cumulative distribution network (CDN) is a recently developed class of probabilistic graphical models (PGMs) permitting a copula factorization, in which the CDF, rather than the density, is factored. Despite there being much recent interest within the machine learning community about copula representations, there has been scarce research into the CDN, its amalgamation with copula theory, and no evaluation of its performance. Algorithms for inference, sampling, and learning in these models are underdeveloped compared those of other PGMs, hindering widerspread use. One advantage of the CDN is that it allows the factors to be parameterized as copulae, combining the benefits of graphical models with those of copula theory. In brief, the use of a copula parameterization enables greater modelling flexibility by separating representation of the marginals from the dependence structure, permitting more efficient and robust learning. Another advantage is that the CDN permits the representation of implicit latent variables, whose parameterization and connectivity are not required to be specified. Unfortunately, that the model can encode only latent relationships between variables severely limits its utility. In this thesis, we present inference, learning, and sampling for CDNs, and further the state-of-the-art. First, we explain the basics of copula theory and the representation of copula CDNs. Then, we discuss inference in the models, and develop the first sampling algorithm. We explain standard learning methods, propose an algorithm for learning from data missing completely at random (MCAR), and develop a novel algorithm for learning models of arbitrary treewidth and size. Properties of the models and algorithms are investigated through Monte Carlo simulations. We conclude with further discussion of the advantages and limitations of CDNs, and suggest future work.

algorithm, artificial intelligence, machine learning, (19 more...)

1310.4456

Country: Europe (0.27)

Genre: Research Report (0.81)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
(2 more...)

Supervised Heterogeneous Multiview Learning for Joint Association Study and Disease Diagnosis

Zhe, Shandian, Xu, Zenglin, Qi, Yuan

Given genetic variations and various phenotypical traits, such as Magnetic Resonance Imaging (MRI) features, we consider two important and related tasks in biomedical research: i)to select genetic and phenotypical markers for disease diagnosis and ii) to identify associations between genetic and phenotypical data. These two tasks are tightly coupled because underlying associations between genetic variations and phenotypical features contain the biological basis for a disease. While a variety of sparse models have been applied for disease diagnosis and canonical correlation analysis and its extensions have bee widely used in association studies (e.g., eQTL analysis), these two tasks have been treated separately. To unify these two tasks, we present a new sparse Bayesian approach for joint association study and disease diagnosis. In this approach, common latent features are extracted from different data sources based on sparse projection matrices and used to predict multiple disease severity levels based on Gaussian process ordinal regression; in return, the disease status is used to guide the discovery of relationships between the data sources. The sparse projection matrices not only reveal interactions between data sources but also select groups of biomarkers related to the disease. To learn the model from data, we develop an efficient variational expectation maximization algorithm. Simulation results demonstrate that our approach achieves higher accuracy in both predicting ordinal labels and discovering associations between data sources than alternative methods. We apply our approach to an imaging genetics dataset for the study of Alzheimer's Disease (AD). Our method identifies biologically meaningful relationships between genetic variations, MRI features, and AD status, and achieves significantly higher accuracy for predicting ordinal AD stages than the competing methods.

artificial intelligence, machine learning, modeling & simulation, (17 more...)

1304.7284

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Neurology > Alzheimer's Disease (0.55)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)

Atiya, Amir F., Fayed, Hatem A., Abdel-Gawad, Ahmed H.

A New Monte Carlo Based Algorithm for the Gaussian Process Classification Problem

Gaussian process is a very promising novel technology that has been applied to both the regression problem and the classification problem. While for the regression problem it yields simple exact solutions, this is not the case for the classification problem, because we encounter intractable integrals. In this paper we develop a new derivation that transforms the problem into that of evaluating the ratio of multivariate Gaussian orthant integrals. Moreover, we develop a new Monte Carlo procedure that evaluates these integrals. It is based on some aspects of bootstrap sampling and acceptancerejection. The proposed approach has beneficial properties compared to the existing Markov Chain Monte Carlo approach, such as simplicity, reliability, and speed.

artificial intelligence, bayesian inference, machine learning, (16 more...)

1302.722

Country:

Europe > United Kingdom > England (0.28)
North America > United States > California (0.28)

Genre:

Research Report (0.50)
Overview (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.89)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Moreno, E., Vazquez-Polo, F. -J., Robert, C. P.

Two discussions of the paper "Bayesian measures of model complexity and fit" by D. Spiegelhalter et al., Read before The Royal Statistical Society at a meeting organized by the Research Section on Wednesday, March 13th, 2002

arXiv.org Machine LearningOct-11-2013

These are the written discussions of the paper "Bayesian measures of model complexity and fit" by D. Spiegelhalter et al. (2002), following the discussions given at the Annual Meeting of the Royal Statistical Society in Newcastle-upon-Tyne on September 3rd, 2013.

artificial intelligence, machine learning, procedure, (14 more...)

1310.2905

Country:

Europe > United Kingdom > England > Tyne and Wear > Newcastle (0.24)
Europe > Spain > Canary Islands > Gran Canaria (0.14)

Genre: Research Report (0.65)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.53)

arXiv.org Machine LearningOct-10-2013

Gibbs Max-margin Topic Models with Data Augmentation

Zhu, Jun, Chen, Ning, Perkins, Hugh, Zhang, Bo

Max-margin learning is a powerful approach to building classifiers and structured output predictors. Recent work on max-margin supervised topic models has successfully integrated it with Bayesian topic models to discover discriminative latent semantic structures and make accurate predictions for unseen testing data. However, the resulting learning problems are usually hard to solve because of the non-smoothness of the margin loss. Existing approaches to building max-margin supervised topic models rely on an iterative procedure to solve multiple latent SVM subproblems with additional mean-field assumptions on the desired posterior distributions. This paper presents an alternative approach by defining a new max-margin loss. Namely, we present Gibbs max-margin supervised topic models, a latent variable Gibbs classifier to discover hidden topic representations for various tasks, including classification, regression and multi-task learning. Gibbs max-margin supervised topic models minimize an expected margin loss, which is an upper bound of the existing margin loss derived from an expected prediction rule. By introducing augmented variables and integrating out the Dirichlet variables analytically by conjugacy, we develop simple Gibbs sampling algorithms with no restricting assumptions and no need to solve SVM subproblems. Furthermore, each step of the "augment-and-collapse" Gibbs sampling algorithms has an analytical conditional distribution, from which samples can be easily drawn. Experimental results demonstrate significant improvements on time efficiency. The classification performance is also significantly improved over competitors on binary, multi-class and multi-label classification tasks.

classifier, machine learning, natural language, (17 more...)

1310.2816

Country:

Asia (1.00)
North America > United States (0.67)

Genre: Research Report > New Finding (0.48)

Industry:

Government (0.46)
Information Technology (0.45)
Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)