AITopics | Bayesian Learning

Bayesian Learning

A Bayesian network, Bayes network, belief network, Bayes(ian) model or probabilistic directed acyclic graphical model is a probabilistic graphical model (a type of statistical model) that represents a set of variables and their conditional dependencies via a directed acyclic graph (DAG). (Wikipedia)
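As a minimal sketch of the definition above: the joint distribution of a Bayesian network factorizes into one conditional distribution per node given its parents in the DAG. The three-variable network below (Rain, Sprinkler, WetGrass) and all of its probabilities are hypothetical values chosen for illustration; they do not come from any of the listed papers.

```python
from itertools import product

# Hypothetical DAG: Rain -> Sprinkler, Rain -> WetGrass, Sprinkler -> WetGrass.
# All numbers below are illustrative assumptions.
P_RAIN = {True: 0.2, False: 0.8}
# P(Sprinkler | Rain), indexed as P_SPRINKLER[rain][sprinkler]
P_SPRINKLER = {True: {True: 0.01, False: 0.99},
               False: {True: 0.4, False: 0.6}}
# P(WetGrass = True | Rain, Sprinkler), indexed by (rain, sprinkler)
P_WET_TRUE = {(True, True): 0.99, (True, False): 0.8,
              (False, True): 0.9, (False, False): 0.0}


def joint(rain, sprinkler, wet):
    """Joint probability as the product of per-node conditionals."""
    p_wet = P_WET_TRUE[(rain, sprinkler)]
    if not wet:
        p_wet = 1.0 - p_wet
    return P_RAIN[rain] * P_SPRINKLER[rain][sprinkler] * p_wet


def posterior_rain_given_wet():
    """P(Rain = True | WetGrass = True) by brute-force enumeration."""
    numerator = sum(joint(True, s, True) for s in (True, False))
    evidence = sum(joint(r, s, True)
                   for r, s in product((True, False), repeat=2))
    return numerator / evidence
```

Enumeration like this is only practical for tiny networks; it is shown here because it makes the factorization explicit.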

A Monte Carlo Algorithm for Universally Optimal Bayesian Sequence Prediction and Planning

Di Franco, Anthony

arXiv.org Artificial Intelligence, Jan-17-2010

The aim of this work is to address the question of whether we can in principle design rational decision-making agents or artificial intelligences embedded in computable physics such that their decisions are optimal in reasonable mathematical senses. Recent developments in rare event probability estimation, recursive Bayesian inference, neural networks, and probabilistic planning are sufficient to explicitly approximate reinforcement learners of the AIXI style with non-trivial model classes (here, the class of resource-bounded Turing machines). Consideration of the effects of resource limitations in a concrete implementation leads to insights about possible architectures for learning systems using optimal decision makers as components.

artificial intelligence, machine learning, sequence, (12 more...)

arXiv.org Artificial Intelligence

1001.2813

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.65)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)

Client-server multi-task learning from distributed datasets

Dinuzzo, Francesco, Pillonetto, Gianluigi, De Nicolao, Giuseppe

arXiv.org Artificial Intelligence, Jan-11-2010

A client-server architecture to simultaneously solve multiple learning tasks from distributed datasets is described. In such an architecture, each client is associated with an individual learning task and the associated dataset of examples. The goal of the architecture is to perform information fusion from multiple datasets while preserving the privacy of individual data. The role of the server is to collect data in real-time from the clients and codify the information in a common database. The information coded in this database can be used by all the clients to solve their individual learning tasks, so that each client can exploit the informative content of all the datasets without actually having access to the private data of others. The proposed algorithmic framework, based on regularization theory and kernel methods, uses a suitable class of mixed effect kernels. The new method is illustrated through a simulated music recommendation system.

artificial intelligence, machine learning, proceedings, (18 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/TNN.2010.2095882

0812.4235

Country: North America > United States > California > San Francisco County > San Francisco (0.28)

Genre: Research Report (0.82)

Industry:

Media > Music (1.00)
Leisure & Entertainment (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Learning Gaussian Tree Models: Analysis of Error Exponents and Extremal Structures

Tan, Vincent Y. F., Anandkumar, Animashree, Willsky, Alan S.

arXiv.org Machine Learning, Jan-4-2010

The problem of learning tree-structured Gaussian graphical models from independent and identically distributed (i.i.d.) samples is considered. The influence of the tree structure and the parameters of the Gaussian distribution on the learning rate as the number of samples increases is discussed. Specifically, the error exponent corresponding to the event that the estimated tree structure differs from the actual unknown tree structure of the distribution is analyzed. Finding the error exponent reduces to a least-squares problem in the very noisy learning regime. In this regime, it is shown that the extremal tree structure that minimizes the error exponent is the star for any fixed set of correlation coefficients on the edges of the tree. If the magnitudes of all the correlation coefficients are less than 0.63, it is also shown that the tree structure that maximizes the error exponent is the Markov chain. In other words, the star and the chain graphs represent the hardest and the easiest structures to learn in the class of tree-structured Gaussian graphical models. This result can also be intuitively explained by correlation decay: pairs of nodes which are far apart, in terms of graph distance, are unlikely to be mistaken as edges by the maximum-likelihood estimator in the asymptotic regime.
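The maximum-likelihood estimator mentioned in this abstract can be sketched as follows, assuming the standard Chow-Liu construction: for jointly Gaussian variables the pairwise mutual information is -0.5 * log(1 - rho^2), and the estimated tree is a maximum-weight spanning tree over those weights. The function names and the synthetic Markov-chain data generator are hypothetical helpers for illustration, not code from the paper.

```python
import math
import random


def sample_correlation(xs, ys):
    """Empirical correlation coefficient of two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def gaussian_mutual_information(rho):
    """Mutual information of a bivariate Gaussian with correlation rho."""
    return -0.5 * math.log(1.0 - rho * rho)


def chow_liu_tree(columns):
    """Maximum-weight spanning tree (Kruskal) over empirical mutual information."""
    d = len(columns)
    edges = []
    for i in range(d):
        for j in range(i + 1, d):
            rho = sample_correlation(columns[i], columns[j])
            edges.append((gaussian_mutual_information(rho), i, j))
    edges.sort(reverse=True)  # heaviest edges first

    parent = list(range(d))  # union-find forest for cycle detection

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    tree = []
    for _, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            tree.append((i, j))
    return sorted(tree)


def sample_chain(n, rho, d=4, seed=0):
    """Hypothetical test data: n samples from a d-node Gaussian Markov chain."""
    rng = random.Random(seed)
    columns = [[] for _ in range(d)]
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        columns[0].append(x)
        for k in range(1, d):
            x = rho * x + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
            columns[k].append(x)
    return columns
```

With many samples and a moderate correlation, `chow_liu_tree(sample_chain(5000, 0.6))` should recover the chain edges; the error exponent studied in the paper describes how quickly the probability of a wrong tree decays as the sample size grows.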

artificial intelligence, correlation coefficient, machine learning, (13 more...)

arXiv.org Machine Learning

doi: 10.1109/TSP.2010.2042478

0909.5216

Country: North America > United States (0.67)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Bayesian orthogonal component analysis for sparse representation

Dobigeon, Nicolas, Tourneret, Jean-Yves

arXiv.org Machine Learning, Jan-4-2010

This paper addresses the problem of identifying a lower dimensional space where observed data can be sparsely represented. This under-complete dictionary learning task can be formulated as a blind separation problem of sparse sources linearly mixed with an unknown orthogonal mixing matrix. This issue is formulated in a Bayesian framework. First, the unknown sparse sources are modeled as Bernoulli-Gaussian processes. To promote sparsity, a weighted mixture of an atom at zero and a Gaussian distribution is proposed as the prior distribution for the unobserved sources. A non-informative prior distribution defined on an appropriate Stiefel manifold is selected for the mixing matrix. The Bayesian inference on the unknown parameters is conducted using a Markov chain Monte Carlo (MCMC) method. A partially collapsed Gibbs sampler is designed to generate samples asymptotically distributed according to the joint posterior distribution of the unknown model parameters and hyperparameters. These samples are then used to approximate the joint maximum a posteriori estimator of the sources and mixing matrix. Simulations conducted on synthetic data are reported to illustrate the performance of the method for recovering sparse representations. An application to sparse coding on an under-complete dictionary is finally investigated.
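The sparsity-promoting prior described in this abstract, a weighted mixture of an atom at zero and a Gaussian, can be illustrated by drawing samples from it. This is only a sketch of the prior, not of the paper's Gibbs sampler; the parameter names `active_prob` and `sigma` are assumptions introduced here.

```python
import random


def sample_bernoulli_gaussian(n, active_prob, sigma, seed=0):
    """Draw n values from a Bernoulli-Gaussian prior.

    Each value is exactly 0.0 with probability 1 - active_prob and is
    otherwise drawn from a zero-mean Gaussian with standard deviation sigma.
    """
    rng = random.Random(seed)
    return [rng.gauss(0.0, sigma) if rng.random() < active_prob else 0.0
            for _ in range(n)]


def fraction_nonzero(values):
    """Share of entries that are not exactly zero (a simple sparsity measure)."""
    return sum(1 for v in values if v != 0.0) / len(values)
```

A small `active_prob` yields mostly zero entries, which is how this prior encodes the assumption that the sources are sparse.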

artificial intelligence, bayesian inference, machine learning, (16 more...)

arXiv.org Machine Learning

doi: 10.1109/TSP.2010.2041594

0908.4489

Country:

Europe (1.00)
North America > United States (0.93)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Learning to Explore and Exploit in POMDPs

Cai, Chenghui, Liao, Xuejun, Carin, Lawrence

Neural Information Processing Systems, Dec-31-2009

A fundamental objective in reinforcement learning is the maintenance of a proper balance between exploration and exploitation. This problem becomes more challenging when the agent can only partially observe the states of its environment. In this paper we propose a dual-policy method for jointly learning the agent behavior and the balance between exploration and exploitation in partially observable environments. The method subsumes traditional exploration, in which the agent takes actions to gather information about the environment, and active learning, in which the agent queries an oracle for optimal actions (with an associated cost for employing the oracle). The form of the employed exploration is dictated by the specific problem. Theoretical guarantees are provided concerning the optimality of the balancing of exploration and exploitation. The effectiveness of the method is demonstrated by experimental results on benchmark problems.

bayesian inference, exploration, upstream oil & gas, (20 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Industry: Energy > Oil & Gas > Upstream (0.76)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.53)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Sharing Features among Dynamical Systems with Beta Processes

Fox, Emily, Jordan, Michael I., Sudderth, Erik B., Willsky, Alan S.

Neural Information Processing Systems, Dec-31-2009

We propose a Bayesian nonparametric approach to relating multiple time series via a set of latent, dynamical behaviors. Using a beta process prior, we allow data-driven selection of the size of this set, as well as the pattern with which behaviors are shared among time series. Via the Indian buffet process representation of the beta process predictive distributions, we develop an exact Markov chain Monte Carlo inference method. In particular, our approach uses the sum-product algorithm to efficiently compute Metropolis-Hastings acceptance probabilities, and explores new dynamical behaviors via birth/death proposals. We validate our sampling algorithm using several synthetic datasets, and also demonstrate promising unsupervised segmentation of visual motion capture data.
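The Indian buffet process representation mentioned in this abstract can be sketched as a generative procedure for a binary matrix whose rows are time series and whose columns are shared behaviors. The code below draws from the standard IBP prior only; it is not the paper's MCMC inference method. The function names and the concentration parameter `alpha` are assumptions made for this illustration.

```python
import math
import random


def sample_poisson(rng, lam):
    """Poisson draw via Knuth's multiplication method (fine for small lam)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def sample_ibp(num_series, alpha, seed=0):
    """Sample a binary feature-assignment matrix from the IBP prior.

    Row i reuses an existing feature k with probability counts[k] / i,
    then adds Poisson(alpha / i) brand-new features.
    """
    rng = random.Random(seed)
    counts = []  # counts[k] = number of earlier rows that use feature k
    rows = []
    for i in range(1, num_series + 1):
        row = [1 if rng.random() < m / i else 0 for m in counts]
        num_new = sample_poisson(rng, alpha / i)
        row.extend([1] * num_new)
        # zip stops at the old feature count; new features start with count 1
        counts = [m + z for m, z in zip(counts, row)] + [1] * num_new
        rows.append(row)
    width = len(counts)
    # Pad earlier rows with zeros so the result is rectangular
    return [row + [0] * (width - len(row)) for row in rows]
```

The number of columns is not fixed in advance, which is the sense in which the size of the behavior set is selected in a data-driven way.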

artificial intelligence, bayesian inference, beta process, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts (0.14)
North America > United States > California (0.14)

Industry: Energy > Oil & Gas (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.90)

Bounds on marginal probability distributions

Mooij, Joris M., Kappen, Hilbert J.

Neural Information Processing Systems, Dec-31-2009

We propose a novel bound on single-variable marginal probability distributions in factor graphs with discrete variables. The bound is obtained by propagating local bounds (convex sets of probability distributions) over a subtree of the factor graph, rooted in the variable of interest. By construction, the method not only bounds the exact marginal probability distribution of a variable, but also its approximate Belief Propagation marginal ("belief"). Thus, apart from providing a practical means to calculate bounds on marginals, our contribution also lies in providing a better understanding of the error made by Belief Propagation. We show that our bound outperforms the state-of-the-art on some inference problems arising in medical diagnosis.

artificial intelligence, factor graph, machine learning, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Netherlands (0.14)
Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)

Industry:

Energy > Oil & Gas (0.46)
Health & Medicine (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Large Scale Nonparametric Bayesian Inference: Data Parallelisation in the Indian Buffet Process

Doshi-Velez, Finale, Mohamed, Shakir, Ghahramani, Zoubin, Knowles, David A.

Neural Information Processing Systems, Dec-31-2009

Nonparametric Bayesian models provide a framework for flexible probabilistic modelling of complex datasets. Unfortunately, Bayesian inference methods often require high-dimensional averages and can be slow to compute, especially with the potentially unbounded representations associated with nonparametric models. We address the challenge of scaling nonparametric Bayesian inference to the increasingly large datasets found in real-world applications, focusing on the case of parallelising inference in the Indian Buffet Process (IBP). Our approach divides a large data set between multiple processors. The processors use message passing to compute likelihoods in an asynchronous, distributed fashion and to propagate statistics about the global Bayesian posterior. This novel MCMC sampler is the first parallel inference scheme for IBP-based models, scaling to datasets orders of magnitude larger than had previously been possible.

artificial intelligence, machine learning, processor, (18 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
South America > Paraguay > Asunción > Asunción (0.04)
North America > United States > District of Columbia > Washington (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Generative versus discriminative training of RBMs for classification of fMRI images

Schmah, Tanya, Hinton, Geoffrey E., Small, Steven L., Strother, Stephen, Zemel, Richard S.

Neural Information Processing Systems, Dec-31-2009

Neuroimaging datasets often have a very large number of voxels and a very small number of training cases, which means that overfitting of models for this data can become a very serious problem. Working with a set of fMRI images from a study on stroke recovery, we consider a classification task for which logistic regression performs poorly, even when L1- or L2-regularized. We show that much better discrimination can be achieved by fitting a generative model to each separate condition and then seeing which model is most likely to have generated the data. We compare discriminative training of exactly the same set of models, and we also consider convex blends of generative and discriminative training.
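The generative classification rule described in this abstract (fit one model per condition, then pick the model most likely to have generated a new example) can be sketched with a much simpler model class. The code below swaps in diagonal Gaussians in place of the paper's RBMs, purely to keep the example short; the function names and toy data are hypothetical.

```python
import math


def fit_diag_gaussian(rows, min_var=1e-6):
    """Fit per-feature mean and variance (a stand-in for the paper's RBMs)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    variances = [max(sum((r[j] - means[j]) ** 2 for r in rows) / n, min_var)
                 for j in range(d)]
    return means, variances


def log_likelihood(x, model):
    """Log density of x under a diagonal Gaussian model."""
    means, variances = model
    return sum(-0.5 * math.log(2.0 * math.pi * v) - (xi - m) ** 2 / (2.0 * v)
               for xi, m, v in zip(x, means, variances))


def classify(x, models):
    """Return the label whose generative model assigns x the highest likelihood."""
    return max(models, key=lambda label: log_likelihood(x, models[label]))


# Toy usage: one generative model per condition, fitted separately.
models = {
    "condition_a": fit_diag_gaussian([[0, 0], [1, 1], [0, 1], [1, 0]]),
    "condition_b": fit_diag_gaussian([[5, 5], [6, 6], [5, 6], [6, 5]]),
}
```

Each model is trained only on its own condition's data, which is the sense in which the training is generative rather than discriminative.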

classification task, discriminative training, generative training, (15 more...)

Neural Information Processing Systems

Country:

North America > Canada > Ontario > Toronto (0.48)
Asia > Middle East > Jordan (0.05)
North America > United States > Illinois > Cook County > Chicago (0.04)

Genre: Research Report > New Finding (0.89)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Health Care Technology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.95)

Psychiatry: Insights into depression through normative decision-making models

Huys, Quentin J., Vogelstein, Joshua, Dayan, Peter

Neural Information Processing Systems, Dec-31-2009

Decision making lies at the very heart of many psychiatric diseases. It is also a central theoretical concern in a wide variety of fields and has undergone detailed, in-depth analyses. We take as an example Major Depressive Disorder (MDD), applying insights from a Bayesian reinforcement learning framework. We focus on anhedonia and helplessness. Helplessness--a core element in the conceptualizations of MDD that has led to major advances in its treatment, pharmacological and neurobiological understanding--is formalized as a simple prior over the outcome entropy of actions in uncertain environments.

depression, outcome distribution, slot machine, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Maryland > Baltimore (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report > Experimental Study (0.68)

Industry: Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.35)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)
