AITopics | Bayesian Learning

Bayesian Learning

A Bayesian network, Bayes network, belief network, Bayes(ian) model or probabilistic directed acyclic graphical model is a probabilistic graphical model (a type of statistical model) that represents a set of variables and their conditional dependencies via a directed acyclic graph (DAG). (Wikipedia)
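
As a minimal sketch of what this definition implies, the snippet below encodes a three-variable DAG (rain -> sprinkler, rain -> wet, sprinkler -> wet) and evaluates the joint distribution as the product of each node's conditional probability given its parents. The variable names and all probability values are illustrative assumptions, not taken from the source.

```python
from itertools import product

# Hypothetical DAG: rain -> sprinkler, rain -> wet, sprinkler -> wet.
# All probabilities below are made-up illustrative values.
P_RAIN = {True: 0.2, False: 0.8}
# P(sprinkler | rain), indexed as P_SPRINKLER[rain][sprinkler]
P_SPRINKLER = {True: {True: 0.01, False: 0.99},
               False: {True: 0.4, False: 0.6}}
# P(wet = True | sprinkler, rain), indexed by (sprinkler, rain)
P_WET_TRUE = {(True, True): 0.99, (True, False): 0.9,
              (False, True): 0.8, (False, False): 0.0}


def joint(rain, sprinkler, wet):
    """P(rain) * P(sprinkler | rain) * P(wet | sprinkler, rain)."""
    p_wet = P_WET_TRUE[(sprinkler, rain)]
    return P_RAIN[rain] * P_SPRINKLER[rain][sprinkler] * (p_wet if wet else 1.0 - p_wet)


def posterior_rain_given_wet():
    """P(rain = True | wet = True), obtained by summing the joint over the sprinkler."""
    num = sum(joint(True, s, True) for s in (True, False))
    den = sum(joint(r, s, True) for r, s in product((True, False), repeat=2))
    return num / den


# The factorized joint should sum to 1 over all eight assignments.
total = sum(joint(r, s, w) for r, s, w in product((True, False), repeat=3))
```

Storing one conditional table per node, rather than the full joint table, is the point of the DAG representation: each table only needs to cover a node and its parents.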

Unsupervised Sub-tree Alignment for Tree-to-Tree Translation

Xiao, T., Zhu, J.

Journal of Artificial Intelligence Research, Nov-22-2013

This article presents a probabilistic sub-tree alignment model and its application to tree-to-tree machine translation. Unlike previous work, we do not resort to surface heuristics or expensive annotated data, but instead derive an unsupervised model to infer the syntactic correspondence between two languages. More importantly, the developed model is syntactically motivated and does not rely on word alignments. As a by-product, our model outputs a sub-tree alignment matrix encoding a large number of diverse alignments between syntactic structures, from which machine translation systems can efficiently extract translation rules that are often filtered out due to errors in the 1-best alignment. Experimental results show that the proposed approach outperforms three state-of-the-art baseline approaches in both alignment accuracy and grammar quality. When applied to machine translation, our approach yields a +1.0 BLEU improvement and a 0.9 TER reduction on the NIST machine translation evaluation corpora. With tree binarization and fuzzy decoding, it even outperforms a state-of-the-art hierarchical phrase-based system.

alignment, probability, sub-tree alignment, (13 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.4033

AI Access Foundation

10850

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
Asia > Singapore (0.04)
(25 more...)

Genre: Research Report > New Finding (0.87)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Learning Pairwise Graphical Models with Nonlinear Sufficient Statistics

Yuan, Xiao-Tong, Li, Ping, Zhang, Tong

arXiv.org Machine Learning, Nov-22-2013

We investigate a generic problem of learning pairwise exponential family graphical models with pairwise sufficient statistics defined by a global mapping function, e.g., Mercer kernels. This subclass of pairwise graphical models allows us to flexibly capture complex interactions among variables beyond the pairwise product. We propose two $\ell_1$-norm penalized maximum likelihood estimators to learn the model parameters from i.i.d. samples. The first is a joint estimator, which estimates all the parameters simultaneously. The second is a node-wise conditional estimator, which estimates the parameters individually for each node. For both estimators, we show that under proper conditions the extra flexibility gained in our model comes at almost no cost in statistical and computational efficiency. We demonstrate the advantages of our model over state-of-the-art methods on synthetic and real datasets.

artificial intelligence, estimator, machine learning, (18 more...)

arXiv.org Machine Learning

1311.5479

Country: North America > United States > New Jersey (0.14)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.49)

Nonparametric Bayesian models of hierarchical structure in complex networks

Schmidt, Mikkel N., Herlau, Tue, Mørup, Morten

arXiv.org Machine Learning, Nov-21-2013

Analyzing and understanding the structure of complex relational data is important in many applications, including analysis of the connectivity in the human brain. Such networks can have prominent patterns on different scales, calling for a hierarchically structured model. We propose two non-parametric Bayesian hierarchical network models based on Gibbs fragmentation tree priors, and demonstrate their ability to capture nested patterns in simulated networks. On real networks we demonstrate detection of hierarchical structure and show predictive performance on par with the state of the art. We envision that our methods can be employed in exploratory analysis of large-scale complex networks, for example to model human brain connectivity.

artificial intelligence, hierarchical structure, machine learning, (18 more...)

arXiv.org Machine Learning

1311.1033

Genre: Research Report (0.50)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.89)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)

Streaming Variational Bayes

Broderick, Tamara, Boyd, Nicholas, Wibisono, Andre, Wilson, Ashia C., Jordan, Michael I.

arXiv.org Machine Learning, Nov-20-2013

We present SDA-Bayes, a framework for (S)treaming, (D)istributed, (A)synchronous computation of a Bayesian posterior. The framework makes streaming updates to the estimated posterior according to a user-specified approximation batch primitive. We demonstrate the usefulness of our framework, with variational Bayes (VB) as the primitive, by fitting the latent Dirichlet allocation model to two large-scale document collections. We demonstrate the advantages of our algorithm over stochastic variational inference (SVI) by comparing the two after a single pass through a known amount of data (a case where SVI may be applied) and in the streaming setting, where SVI does not apply.
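
The streaming idea in this abstract, where the posterior after one batch serves as the prior for the next, can be sketched as follows. This is only an illustration under a strong simplifying assumption: it swaps in an exact conjugate Beta-Bernoulli update as the batch primitive instead of variational Bayes, and it does not show the distributed or asynchronous parts. The function names `update` and `streaming_posterior` are hypothetical.

```python
def update(prior, batch):
    """Beta-Bernoulli conjugate update: prior (a, b) plus a batch of 0/1 observations."""
    a, b = prior
    ones = sum(batch)
    return (a + ones, b + len(batch) - ones)


def streaming_posterior(prior, batches):
    """Fold batches in one at a time; each posterior becomes the next prior."""
    posterior = prior
    for batch in batches:
        posterior = update(posterior, batch)
    return posterior


# Made-up data: three mini-batches of coin flips, starting from a uniform Beta(1, 1) prior.
batches = [[1, 0, 1], [1, 1], [0, 0, 1, 0]]
streamed = streaming_posterior((1, 1), batches)
all_at_once = update((1, 1), [x for batch in batches for x in batch])
```

With an exact conjugate primitive the streamed result matches the single-batch result; with an approximate primitive such as VB, the streamed posterior is itself an approximation.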

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Machine Learning

1307.6769

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)
Information Technology > Artificial Intelligence > Natural Language (0.88)

A survey on independence-based Markov networks learning

Schlüter, Federico

arXiv.org Artificial Intelligence, Nov-20-2013

Name | Reference | Comments
KS | Koller and Sahami (1996) | Not sound; the first of this type; requires specifying MB size in advance
GS | Margaritis and Thrun (2000) | Sound in theory; proposed to learn Bayesian networks via the induction of neighbors of each variable; the first algorithm of this kind to be proved; works in two phases: grow and shrink
IAMB and its variants | Tsamardinos et al. (2003) | Sound in theory; actually a variant of GS; simple to implement; time efficient; very poor data efficiency; IAMB's variants achieve better data efficiency than IAMB
HITON-PC/MB | Aliferis et al. (2003) | Not sound; another attempt to use topology information to enhance data efficiency; data efficiency comparable to IAMB; much slower than IAMB
Fast-IAMB | Yaramakala and Margaritis (2005) | Sound in theory; no fundamental difference from IAMB; adds candidates more greedily to speed up learning; still poor data efficiency
MMPC/MB | Tsamardinos et al. (2006) | Not sound; the first to make use of the underlying topology information; much more data efficient than IAMB; much slower than IAMB
PCMB | Peña et al. (2007) | Sound in theory; data efficient by making use of topology information; poor time efficiency; distinguishes spouses from parents/children; distinguishes some children from parents/children
IPC-MB | Fu and Desmarais (2008) | Sound in theory; most data efficient compared with previous algorithms; much faster than PCMB in computation; distinguishes spouses from parents/children; distinguishes some children from parents/children; best tradeoff among this family of algorithms

artificial intelligence, bayesian inference, machine learning, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.1007/s10462-012-9346-y

1108.2283

Country: North America > United States > California (0.28)

Genre:

Research Report (1.00)
Overview (0.67)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Nonparametric Bayes dynamic modeling of relational data

Durante, Daniele, Dunson, David B.

arXiv.org Machine Learning, Nov-19-2013

Symmetric binary matrices representing relations among entities are commonly collected in many areas. Our focus is on dynamically evolving binary relational matrices, with interest being in inference on the relationship structure and prediction. We propose a nonparametric Bayesian dynamic model, which reduces dimensionality in characterizing the binary matrix through a lower-dimensional latent space representation, with the latent coordinates evolving in continuous time via Gaussian processes. By using a logistic mapping function from the probability matrix space to the latent relational space, we obtain a flexible and computationally tractable formulation. Employing Pólya-Gamma data augmentation, an efficient Gibbs sampler is developed for posterior computation, with the dimension of the latent space automatically inferred. We provide some theoretical results on flexibility of the model, and illustrate performance via simulation experiments. We also consider an application to co-movements in world financial markets.

artificial intelligence, machine learning, modeling & simulation, (16 more...)

arXiv.org Machine Learning

doi: 10.1093/biomet/asu040

1311.4669

Country:

Europe (1.00)
Asia (1.00)
North America > United States (0.29)

Genre: Research Report (0.50)

Industry: Banking & Finance (1.00)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

Reasoning with Uncertainties Over Existence of Objects

Ngo, Vien Anh (University of Stuttgart) | Toussaint, Marc (University of Stuttgart)

AAAI Conferences, Nov-14-2013

In this paper we consider planning problems in relational Markov processes where objects may “appear” or “disappear”, perhaps depending on previous actions or properties of other objects. For instance, problems that require explicitly generating or discovering objects fall into this category. In our formulation this requires explicitly representing the uncertainty over the number of objects (dimensions or factors) in a dynamic Bayesian network (DBN). Many formalisms (also existing ones) are conceivable to formulate such problems. We aim at a formulation that facilitates inference and planning. Based on a specific formulation we investigate two inference methods, rejection sampling and reversible-jump MCMC, to compute a posterior over the process conditioned on the first and last time slice (start and goal state). We will discuss properties, efficiency, and appropriateness of each one.
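
To illustrate rejection sampling conditioned on the first and last time slice, here is a toy sketch that is not the authors' model: the state is just the number of objects, which may grow or shrink by one per step, and a forward-sampled trajectory is kept only if its final count equals the goal. The transition probabilities and the names `simulate` and `rejection_sample` are assumptions made for this example.

```python
import random


def simulate(start, steps, rng, p_appear=0.3, p_disappear=0.3):
    """Forward-sample the object count; at each step one object may appear or disappear."""
    counts = [start]
    for _ in range(steps):
        n = counts[-1]
        u = rng.random()
        if u < p_appear:
            n += 1
        elif u < p_appear + p_disappear and n > 0:
            n -= 1
        counts.append(n)
    return counts


def rejection_sample(start, goal, steps, n_proposals, seed=0):
    """Keep only forward trajectories whose last time slice matches the goal count."""
    rng = random.Random(seed)
    proposals = (simulate(start, steps, rng) for _ in range(n_proposals))
    return [traj for traj in proposals if traj[-1] == goal]


# Accepted trajectories are samples from the process given start = 1 and goal = 3.
accepted = rejection_sample(start=1, goal=3, steps=6, n_proposals=5000)
```

The acceptance rate drops quickly as the goal becomes less likely under the forward process, which is the usual motivation for MCMC alternatives such as the reversible-jump sampler the abstract mentions.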

existence, machine learning, reasoning, (1 more...)

AAAI Conferences

2013 AAAI Fall Symposium Series

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.87)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.53)

Flexible sampling of discrete data correlations without the marginal distributions

Kalaitzis, Alfredo, Silva, Ricardo

arXiv.org Machine Learning, Nov-14-2013

Learning the joint dependence of discrete variables is a fundamental problem in machine learning, with many applications including prediction, clustering and dimensionality reduction. More recently, the framework of copula modeling has gained popularity due to its modular parameterization of joint distributions. Among other properties, copulas provide a recipe for combining flexible models for univariate marginal distributions with parametric families suitable for potentially high-dimensional dependence structures. More radically, the extended rank likelihood approach of Hoff (2007) bypasses learning marginal models completely when such information is ancillary to the learning task at hand, as in, e.g., standard dimensionality reduction problems or copula parameter estimation. The main idea is to represent data by their observable rank statistics, ignoring any other information from the marginals. Inference is typically done in a Bayesian framework with Gaussian copulas, and it is complicated by the fact that this implies sampling within a space where the number of constraints increases quadratically with the number of data points. The result is slow mixing when using off-the-shelf Gibbs sampling. We present an efficient algorithm based on recent advances in constrained Hamiltonian Markov chain Monte Carlo that is simple to implement and does not require paying a quadratic cost in sample size.

artificial intelligence, gaussian copula, machine learning, (15 more...)

arXiv.org Machine Learning

1306.2685

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.86)

On Estimating Many Means, Selection Bias, and the Bootstrap

Simon, Noah, Simon, Richard

arXiv.org Machine Learning, Nov-14-2013

With recent advances in high-throughput technology, researchers often find themselves running a large number of hypothesis tests (thousands+) and estimating a large number of effect sizes. Generally there is particular interest in those effects estimated to be most extreme. Unfortunately, naive estimates of these effect sizes (even after potentially accounting for multiplicity in a testing procedure) can be severely biased. In this manuscript we explore this bias from a frequentist perspective: we give a formal definition, and show that an oracle estimator using this bias dominates the naive maximum likelihood estimate. We give a resampling estimator to approximate this oracle, and show that it works well on simulated data. We also connect this to ideas in empirical Bayes.
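
The selection bias described here is easy to reproduce in a small simulation. The sketch below only illustrates the bias itself, not the oracle or resampling estimators from the paper, and its setup is an assumption chosen for clarity: every true effect is 0 and each naive estimate adds standard normal noise, so any gap between the largest estimate and its true effect is due to selection alone.

```python
import random


def naive_max_bias(n_effects=1000, n_repeats=200, seed=0):
    """Average gap between the largest naive estimate and the true effect it estimates.

    Every true effect is 0 and each estimate is the truth plus N(0, 1) noise,
    so a nonzero average gap is caused purely by selecting the maximum.
    """
    rng = random.Random(seed)
    true_effect = 0.0
    gaps = []
    for _ in range(n_repeats):
        estimates = [true_effect + rng.gauss(0.0, 1.0) for _ in range(n_effects)]
        gaps.append(max(estimates) - true_effect)
    return sum(gaps) / len(gaps)


bias = naive_max_bias()
```

Each individual estimate is unbiased; the bias appears only after conditioning on which estimate turned out to be most extreme.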

artificial intelligence, bayesian inference, machine learning, (17 more...)

arXiv.org Machine Learning

1311.3709

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

High-dimensional learning of linear causal networks via inverse covariance estimation

Loh, Po-Ling, Bühlmann, Peter

arXiv.org Machine Learning, Nov-14-2013

We establish a new framework for statistical estimation of directed acyclic graphs (DAGs) when data are generated from a linear, possibly non-Gaussian structural equation model. Our framework consists of two parts: (1) inferring the moralized graph from the support of the inverse covariance matrix; and (2) selecting the best-scoring graph amongst DAGs that are consistent with the moralized graph. We show that when the error variances are known or estimated to close enough precision, the true DAG is the unique minimizer of the score computed using the reweighted squared $\ell_2$-loss. Our population-level results have implications for the identifiability of linear SEMs when the error covariances are specified up to a constant multiple. On the statistical side, we establish rigorous conditions for high-dimensional consistency of our two-part algorithm, defined in terms of a "gap" between the true DAG and the next best candidate. Finally, we demonstrate that dynamic programming may be used to select the optimal DAG in linear time when the treewidth of the moralized graph is bounded.
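
Part (1) of this framework can be illustrated at the population level with a small sketch. It assumes a linear SEM written as X = B^T X + eps with independent errors, for which the inverse covariance is (I - B) diag(1/omega) (I - B)^T, and it assumes no exact cancellations between terms. The example DAG, its weights, and the helper names are made up for illustration; part (2), scoring candidate DAGs, is not shown.

```python
def precision_matrix(B, omega):
    """Theta = (I - B) diag(1/omega) (I - B)^T for a linear SEM X = B^T X + eps.

    B[j][k] is the weight of edge j -> k; omega[k] is the variance of eps_k.
    """
    p = len(B)
    A = [[(1.0 if j == k else 0.0) - B[j][k] for k in range(p)] for j in range(p)]
    return [[sum(A[j][l] * A[k][l] / omega[l] for l in range(p)) for k in range(p)]
            for j in range(p)]


def support_edges(theta, tol=1e-12):
    """Undirected edges (j, k), j < k, where the off-diagonal precision entry is nonzero."""
    p = len(theta)
    return {(j, k) for j in range(p) for k in range(j + 1, p) if abs(theta[j][k]) > tol}


# Hypothetical DAG: 0 -> 2 <- 1 and 2 -> 3, with made-up weights and unit error variances.
B = [[0.0, 0.0, 0.8, 0.0],
     [0.0, 0.0, 0.5, 0.0],
     [0.0, 0.0, 0.0, 1.2],
     [0.0, 0.0, 0.0, 0.0]]
omega = [1.0, 1.0, 1.0, 1.0]
moral_edges = support_edges(precision_matrix(B, omega))
```

In this example the support contains the three DAG edges plus the pair (0, 1): nodes 0 and 1 are not adjacent in the DAG but share the child 2, so moralization connects them.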

artificial intelligence, graph, machine learning, (18 more...)

arXiv.org Machine Learning

1311.3492

Country:

Europe (0.46)
North America > United States > California (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.69)
