AITopics | Bayesian Inference

Collaborating Authors

Bayesian Inference

Bayes' Theorem allows a program to infer the probabilities of likely causes from the probabilities of their effects, when what it is given are the probabilities of effects, given the causes.

News Overviews Instructional Materials AI-Alerts Classics

Reduction of Maximum Entropy Models to Hidden Markov Models

Goodman, Joshua

arXiv.org Artificial IntelligenceDec-12-2012

We show that maximum entropy (maxent) models can be modeled with certain kinds of HMMs, allowing us to construct maxent models with hidden variables, hidden state sequences, or other characteristics. The models can be trained using the forward-backward algorithm. While the results are primarily of theoretical interest, unifying apparently unrelated concepts, we also give experimental results for a maxent model with a hidden variable on a word disambiguation task; the model outperforms standard techniques.

artificial intelligence, bayesian inference, machine learning, (17 more...)

arXiv.org Artificial Intelligence

1301.057

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.94)

Add feedback

Iterative Join-Graph Propagation

Dechter, Rina, Kask, Kalev, Mateescu, Robert

arXiv.org Artificial IntelligenceDec-12-2012

The paper presents an iterative version of join-tree clustering that applies the message passing of join-tree clustering algorithm to join-graphs rather than to join-trees, iteratively. It is inspired by the success of Pearl's belief propagation algorithm as an iterative approximation scheme on one hand, and by a recently introduced mini-clustering i. success as an anytime approximation method, on the other. The proposed Iterative Join-graph Propagation IJGP belongs to the class of generalized belief propagation methods, recently proposed using analogy with algorithms in statistical physics. Empirical evaluation of this approach on a number of problem classes demonstrates that even the most time-efficient variant is almost always superior to IBP and MC i, and is sometimes more accurate by as much as several orders of magnitude.

artificial intelligence, bayesian inference, machine learning, (17 more...)

arXiv.org Artificial Intelligence

1301.0564

Country: North America > United States > California (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.95)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)

Add feedback

Interpolating Conditional Density Trees

Davies, Scott, Moore, Andrew

arXiv.org Artificial IntelligenceDec-12-2012

Joint distributions over many variables are frequently modeled by decomposing them into products of simpler, lower-dimensional conditional distributions, such as in sparsely connected Bayesian networks. However, automatically learning such models can be very computationally expensive when there are many datapoints and many continuous variables with complex nonlinear relationships, particularly when no good ways of decomposing the joint distribution are known a priori. In such situations, previous research has generally focused on the use of discretization techniques in which each continuous variable has a single discretization that is used throughout the entire network. \ In this paper, we present and compare a wide variety of tree-based algorithms for learning and evaluating conditional density estimates over continuous variables. These trees can be thought of as discretizations that vary according to the particular interactions being modeled; however, the density within a given leaf of the tree need not be assumed constant, and we show that such nonuniform leaf densities lead to more accurate density estimation. We have developed Bayesian network structure-learning algorithms that employ these tree-based conditional density representations, and we show that they can be used to practically learn complex joint probability models over dozens of continuous variables from thousands of datapoints. We focus on finding models that are simultaneously accurate, fast to learn, and fast to evaluate once they are learned.

artificial intelligence, density tree, machine learning, (15 more...)

arXiv.org Artificial Intelligence

1301.0563

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Convex Relaxations for Learning Bounded Treewidth Decomposable Graphs

Kumar, K. S. Sesh, Bach, Francis

arXiv.org Machine LearningDec-11-2012

We consider the problem of learning the structure of undirected graphical models with bounded treewidth, within the maximum likelihood framework. This is an NP-hard problem and most approaches consider local search techniques. In this paper, we pose it as a combinatorial optimization problem, which is then relaxed to a convex optimization problem that involves searching over the forest and hyperforest polytopes with special structures, independently. A supergradient method is used to solve the dual problem, with a run-time complexity of $O(k^3 n^{k+2} \log n)$ for each iteration, where $n$ is the number of variables and $k$ is a bound on the treewidth. We compare our approach to state-of-the-art methods on synthetic datasets and classical benchmarks, showing the gains of the novel convex approach.

artificial intelligence, constraint, machine learning, (19 more...)

arXiv.org Machine Learning

1212.2573

Country:

Europe (0.28)
North America > United States (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.89)

Add feedback

PAC-Bayesian Learning and Domain Adaptation

Germain, Pascal, Habrard, Amaury, Laviolette, François, Morvant, Emilie

arXiv.org Machine LearningDec-11-2012

In machine learning, Domain Adaptation (DA) arises when the distribution gen- erating the test (target) data differs from the one generating the learning (source) data. It is well known that DA is an hard task even under strong assumptions, among which the covariate-shift where the source and target distributions diverge only in their marginals, i.e. they have the same labeling function. Another popular approach is to consider an hypothesis class that moves closer the two distributions while implying a low-error for both tasks. This is a VC-dim approach that restricts the complexity of an hypothesis class in order to get good generalization. Instead, we propose a PAC-Bayesian approach that seeks for suitable weights to be given to each hypothesis in order to build a majority vote. We prove a new DA bound in the PAC-Bayesian context. This leads us to design the first DA-PAC-Bayesian algorithm based on the minimization of the proposed bound. Doing so, we seek for a \rho-weighted majority vote that takes into account a trade-off between three quantities. The first two quantities being, as usual in the PAC-Bayesian approach, (a) the complexity of the majority vote (measured by a Kullback-Leibler divergence) and (b) its empirical risk (measured by the \rho-average errors on the source sample). The third quantity is (c) the capacity of the majority vote to distinguish some structural difference between the source and target samples.

artificial intelligence, bayesian inference, machine learning, (17 more...)

arXiv.org Machine Learning

1212.234

Country: Europe > France (0.15)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.92)

Add feedback

Simulation-based optimal Bayesian experimental design for nonlinear systems

Huan, Xun, Marzouk, Youssef M.

arXiv.org Machine LearningNov-30-2012

The optimal selection of experimental conditions is essential to maximizing the value of data for inference and prediction, particularly in situations where experiments are time-consuming and expensive to conduct. We propose a general mathematical framework and an algorithmic approach for optimal experimental design with nonlinear simulation-based models; in particular, we focus on finding sets of experiments that provide the most information about targeted sets of parameters. Our framework employs a Bayesian statistical setting, which provides a foundation for inference from noisy, indirect, and incomplete data, and a natural mechanism for incorporating heterogeneous sources of information. An objective function is constructed from information theoretic measures, reflecting expected information gain from proposed combinations of experiments. Polynomial chaos approximations and a two-stage Monte Carlo sampling method are used to evaluate the expected information gain. Stochastic approximation algorithms are then used to make optimization feasible in computationally intensive and high-dimensional settings. These algorithms are demonstrated on model problems and on nonlinear parameter estimation problems arising in detailed combustion kinetics.

bayesian inference, experiment, optimization problem, (17 more...)

arXiv.org Machine Learning

doi: 10.1016/j.jcp.2012.08.013

1108.4146

Country:

Europe > United Kingdom > England (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
Asia (0.14)

Genre: Research Report (1.00)

Industry:

Government > Regional Government (0.46)
Energy > Oil & Gas (0.46)

Technology:

Information Technology > Mathematics of Computing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

Add feedback

Bayesian learning of noisy Markov decision processes

Singh, Sumeetpal S., Chopin, Nicolas, Whiteley, Nick

arXiv.org Machine LearningNov-26-2012

We consider the inverse reinforcement learning problem, that is, the problem of learning from, and then predicting or mimicking a controller based on state/action data. We propose a statistical model for such data, derived from the structure of a Markov decision process. Adopting a Bayesian approach to inference, we show how latent variables of the model can be estimated, and how predictions about actions can be made, in a unified framework. A new Markov chain Monte Carlo (MCMC) sampler is devised for simulation from the posterior distribution. This step includes a parameter expansion step, which is shown to be essential for good convergence properties of the MCMC sampler. As an illustration, the method is applied to learning a human controller.

artificial intelligence, bayesian inference, machine learning, (16 more...)

arXiv.org Machine Learning

1211.5901

Country: North America > United States (0.93)

Genre: Research Report (0.64)

Industry: Leisure & Entertainment > Games (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

A Traveling Salesman Learns Bayesian Networks

Sahai, Tuhin, Klus, Stefan, Dellnitz, Michael

arXiv.org Machine LearningNov-20-2012

Structure learning of Bayesian networks is an important problem that arises in numerous machine learning applications. In this work, we present a novel approach for learning the structure of Bayesian networks using the solution of an appropriately constructed traveling salesman problem. In our approach, one computes an optimal ordering (partially ordered set) of random variables using methods for the traveling salesman problem. This ordering significantly reduces the search space for the subsequent greedy optimization that computes the final structure of the Bayesian network. We demonstrate our approach of learning Bayesian networks on real world census and weather datasets. In both cases, we demonstrate that the approach very accurately captures dependencies between random variables. We check the accuracy of the predictions based on independent studies in both application domains.

artificial intelligence, bayesian network, machine learning, (14 more...)

arXiv.org Machine Learning

1211.4888

Country:

North America > United States > Connecticut > Hartford County (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report (0.70)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

A survey of non-exchangeable priors for Bayesian nonparametric models

Foti, Nicholas J., Williamson, Sinead

arXiv.org Machine LearningNov-20-2012

There has recently been a spate of papers in the statistics and machine learning literature developing dependent stochastic processes and using them as priors in Bayesian nonparametric models. In this paper, we aim to provide a representative snapshot of the currently available models, to elucidate links between these models, and to provide an orienting view of the modern constructions of these processes. Traditional nonparametric priors such as the Dirichlet process [DP, 2], Chinese restaurant process [CRP, 3], Pitman-Yor process [4] and the Indian buffet process [IBP, 5] assume that our observations are exchangeable. Under the assumption of exchangeability the order of the data points does not change the probability distribution. Exchangeability is not a valid assumption for all data.

artificial intelligence, machine learning, poisson process, (18 more...)

arXiv.org Machine Learning

1211.4798

Country:

North America > United States (0.28)
Europe > United Kingdom > England (0.28)

Genre: Overview (1.00)

Industry: Consumer Products & Services > Restaurants (0.49)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.84)

Add feedback

A unifying representation for a class of dependent random measures

Foti, Nicholas J., Futoma, Joseph D., Rockmore, Daniel N., Williamson, Sinead

arXiv.org Machine LearningNov-20-2012

We present a general construction for dependent random measures based on thinning Poisson processes on an augmented space. The framework is not restricted to dependent versions of a specific nonparametric model, but can be applied to all models that can be represented using completely random measures. Several existing dependent random measures can be seen as specific cases of this framework. Interesting properties of the resulting measures are derived and the efficacy of the framework is demonstrated by constructing a covariate-dependent latent feature model and topic model that obtain superior predictive performance.

artificial intelligence, machine learning, natural language, (20 more...)

arXiv.org Machine Learning

1211.4753

Country: North America > United States (1.00)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)

Add feedback