Geiger, Dan
Likelihoods and Parameter Priors for Bayesian Networks
Heckerman, David, Geiger, Dan
We develop simple methods for constructing likelihoods and parameter priors for learning about the parameters and structure of a Bayesian network. In particular, we introduce several assumptions that permit the construction of likelihoods and parameter priors for a large number of Bayesian-network structures from a small set of assessments. The most notable assumption is that of likelihood equivalence, which says that data cannot help to discriminate network structures that encode the same assertions of conditional independence. We describe the constructions that follow from these assumptions, and present a method for directly computing the marginal likelihood of a random sample with no missing observations. Finally, we show how these assumptions lead to a general framework for characterizing parameter priors of multivariate distributions.
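For discrete networks with Dirichlet priors, the direct computation of the marginal likelihood described above reduces to a product of closed-form terms, one per variable and parent configuration. The sketch below shows that per-family term; the function name and the uniform prior in the example are our own illustrative choices, not the paper's notation.

    import math

    def family_log_marginal(counts, alphas):
        # Closed-form Dirichlet-multinomial marginal likelihood for the
        # counts of one variable under one parent configuration:
        #   p(D) = B(alpha + N) / B(alpha),
        # computed with log-gamma terms for numerical stability.
        a0, n0 = sum(alphas), sum(counts)
        log_p = math.lgamma(a0) - math.lgamma(a0 + n0)
        for a, n in zip(alphas, counts):
            log_p += math.lgamma(a + n) - math.lgamma(a)
        return log_p

    # Example: ten binary observations (3 ones, 7 zeros) under a
    # uniform Dirichlet(1, 1) prior.
    print(family_log_marginal([3, 7], [1.0, 1.0]))

The full score of a structure is the sum of such log terms over all variables and parent configurations, which is what makes the computation direct for complete data.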
Learning Bayesian Networks: A Unification for Discrete and Gaussian Domains
Heckerman, David, Geiger, Dan
At last year's conference, we presented approaches for learning Bayesian networks from a combination of prior knowledge and statistical data. These approaches were presented in two papers: one addressing domains containing only discrete variables (Heckerman et al., 1994), and the other addressing domains containing continuous variables related by an unknown multivariate-Gaussian distribution (Geiger and Heckerman, 1994). Unfortunately, these presentations were substantially different, making the parallels between the two methods difficult to appreciate. In this paper, we unify the two approaches. In particular, we abstract our previous assumptions of likelihood equivalence, parameter modularity, and parameter independence such that they are appropriate for discrete and Gaussian domains (as well as other domains). Using these assumptions, we derive a domain-independent Bayesian scoring metric. We then use this general metric in combination with well-known statistical facts about the Dirichlet and normal-Wishart distributions to derive our metrics for discrete and Gaussian domains. In addition, we provide simple proofs that these assumptions are consistent for both domains.
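As a sketch of the shape such a domain-independent metric takes: under parameter independence, parameter modularity, and likelihood equivalence, the marginal likelihood of a structure $m$ factorizes into family terms computable from a single complete model $m_c$, roughly

$$p(D \mid m) \;=\; \prod_{i=1}^{n} \frac{p\left(D^{X_i \cup \mathrm{Pa}_i} \mid m_c\right)}{p\left(D^{\mathrm{Pa}_i} \mid m_c\right)},$$

where $D^{Y}$ denotes the data restricted to the variables in $Y$ and $\mathrm{Pa}_i$ are the parents of $X_i$ in $m$. The notation here is ours, not the paper's; instantiating the right-hand side with Dirichlet or normal-Wishart marginals yields the discrete and Gaussian metrics respectively.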
Parameter Priors for Directed Acyclic Graphical Models and the Characterization of Several Probability Distributions
Geiger, Dan, Heckerman, David
We develop simple methods for constructing parameter priors for model choice among Directed Acyclic Graphical (DAG) models. In particular, we introduce several assumptions that permit the construction of parameter priors for a large number of DAG models from a small set of assessments. We then present a method for directly computing the marginal likelihood of every DAG model given a random sample with no missing observations. We apply this methodology to Gaussian DAG models which consist of a recursive set of linear regression models. We show that the only parameter prior for complete Gaussian DAG models that satisfies our assumptions is the normal-Wishart distribution. Our analysis is based on the following new characterization of the Wishart distribution: let $W$ be an $n \times n$, $n \ge 3$, positive-definite symmetric matrix of random variables and $f(W)$ be a pdf of $W$. Then, $f(W)$ is a Wishart distribution if and only if $W_{11} - W_{12} W_{22}^{-1} W'_{12}$ is independent of $\{W_{12}, W_{22}\}$ for every block partitioning $W_{11}, W_{12}, W'_{12}, W_{22}$ of $W$. Similar characterizations of the normal and normal-Wishart distributions are provided as well.
Parameter Priors for Directed Acyclic Graphical Models and the Characterization of Several Probability Distributions
Geiger, Dan, Heckerman, David
We show that the only parameter prior for complete Gaussian DAG models that satisfies global parameter independence, complete model equivalence, and some weak regularity assumptions is the normal-Wishart distribution. Our analysis is based on the following new characterization of the Wishart distribution: let $W$ be an $n \times n$, $n \ge 3$, positive-definite symmetric matrix of random variables and $f(W)$ be a pdf of $W$. Then, $f(W)$ is a Wishart distribution if and only if $W_{11} - W_{12} W_{22}^{-1} W'_{12}$ is independent of $\{W_{12}, W_{22}\}$ for every block partitioning $W_{11}, W_{12}, W'_{12}, W_{22}$ of $W$. Similar characterizations of the normal and normal-Wishart distributions are provided as well. We also show how to construct a prior for every DAG model over $X$ from the prior of a single regression model.
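The characterization can be illustrated numerically (a sketch assuming scipy is available, not a proof): for Wishart samples, the Schur complement $W_{11} - W_{12} W_{22}^{-1} W'_{12}$ should be independent of $\{W_{12}, W_{22}\}$, and the check below probes one necessary consequence, a near-zero sample correlation with an entry of $W_{22}$.

    import numpy as np
    from scipy.stats import wishart

    rng = np.random.default_rng(0)
    dist = wishart(df=10, scale=np.eye(3))

    schur, w22_entry = [], []
    for _ in range(20000):
        W = dist.rvs(random_state=rng)
        W11, W12, W22 = W[0, 0], W[0, 1:], W[1:, 1:]
        # Schur complement of the 1x1 block W11 (a scalar here).
        schur.append(W11 - W12 @ np.linalg.solve(W22, W12))
        w22_entry.append(W22[0, 0])

    print(np.corrcoef(schur, w22_entry)[0, 1])  # expected to be near 0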
Asymptotic Model Selection for Directed Networks with Hidden Variables
Geiger, Dan, Heckerman, David, Meek, Christopher
We extend the Bayesian Information Criterion (BIC), an asymptotic approximation for the marginal likelihood, to Bayesian networks with hidden variables. This approximation can be used to select models given large samples of data. The standard BIC, as well as our extension, penalizes the complexity of a model according to the dimension of its parameters. We argue that the dimension of a Bayesian network with hidden variables is the rank of the Jacobian matrix of the transformation between the parameters of the network and the parameters of the observable variables. We compute the dimensions of several networks, including the naive Bayes model with a hidden root node.
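The rank computation can be carried out numerically. Below is a small sketch under our own parameterization (not the paper's derivation) for a naive Bayes model with a hidden binary root H and three binary observables: map the network parameters to the joint distribution over the observables and take the rank of a finite-difference Jacobian.

    import numpy as np

    def observable_joint(theta, n_obs=3):
        # theta = (p(H=1), p(Xi=1|H=0) for each i, p(Xi=1|H=1) for each i);
        # returns the joint distribution over the 2**n_obs observable
        # configurations, marginalizing out the hidden root H.
        ph = theta[0]
        px0, px1 = theta[1:1 + n_obs], theta[1 + n_obs:]
        joint = np.zeros(2 ** n_obs)
        for idx in range(2 ** n_obs):
            bits = [(idx >> i) & 1 for i in range(n_obs)]
            t0 = (1 - ph) * np.prod([p if b else 1 - p for p, b in zip(px0, bits)])
            t1 = ph * np.prod([p if b else 1 - p for p, b in zip(px1, bits)])
            joint[idx] = t0 + t1
        return joint

    def effective_dimension(theta, eps=1e-6):
        # Rank of the Jacobian of the parameter-to-distribution map:
        # the penalty dimension proposed in the paper.
        f0 = observable_joint(theta)
        J = np.zeros((f0.size, theta.size))
        for j in range(theta.size):
            t = theta.copy()
            t[j] += eps
            J[:, j] = (observable_joint(t) - f0) / eps
        return np.linalg.matrix_rank(J, tol=1e-4)

    theta = np.random.default_rng(1).uniform(0.2, 0.8, size=7)
    print(effective_dimension(theta))  # 7: full-dimensional for 3 observables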
d-Separation: From Theorems to Algorithms
Geiger, Dan, Verma, Tom S., Pearl, Judea
An efficient algorithm is developed that identifies all independencies implied by the topology of a Bayesian network. Its correctness and maximality stem from the soundness and completeness of d-separation with respect to probability theory. The algorithm runs in time O(|E|), where |E| is the number of edges in the network.
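Below is a compact sketch of the reachability formulation behind such linear-time d-separation tests (in the style later popularized as "Bayes-Ball"); the encoding of the graph as a parents dictionary is our own choice, and x and y are assumed to lie outside the evidence set z.

    from collections import deque

    def d_separated(parents, x, y, z):
        # Reachability over (node, direction) states; each edge is
        # examined a constant number of times, giving O(|E|) time.
        children = {v: set() for v in parents}
        for v, ps in parents.items():
            for p in ps:
                children[p].add(v)

        # z together with its ancestors: needed to decide when a
        # collider (head-to-head node) lets a trail pass.
        anc = set(z)
        stack = list(z)
        while stack:
            v = stack.pop()
            for p in parents[v]:
                if p not in anc:
                    anc.add(p)
                    stack.append(p)

        # direction 'up': reached from a child; 'down': from a parent.
        queue, visited = deque([(x, 'up')]), set()
        while queue:
            node, direction = queue.popleft()
            if (node, direction) in visited:
                continue
            visited.add((node, direction))
            if node == y:
                return False  # an active trail reaches y
            if direction == 'up' and node not in z:
                for p in parents[node]:
                    queue.append((p, 'up'))
                for c in children[node]:
                    queue.append((c, 'down'))
            elif direction == 'down':
                if node not in z:
                    for c in children[node]:
                        queue.append((c, 'down'))
                if node in anc:  # collider with evidence at or below it
                    for p in parents[node]:
                        queue.append((p, 'up'))
        return True

    # Collider example: x -> m <- y.
    parents = {'x': set(), 'y': set(), 'm': {'x', 'y'}}
    print(d_separated(parents, 'x', 'y', set()))   # True: blocked collider
    print(d_separated(parents, 'x', 'y', {'m'}))   # False: conditioning opens it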
An Entropy-based Learning Algorithm of Bayesian Conditional Trees
Geiger, Dan
This article offers a modification of Chow and Liu's learning algorithm in the context of handwritten digit recognition. The modified algorithm directs the user to group digits into several classes consisting of digits that are hard to distinguish, and then to construct an optimal conditional tree representation for each class of digits, instead of for each single digit as done by Chow and Liu (1968). Advantages and extensions of the new method are discussed. Related works by Wong and Wang (1977) and Wong and Poon (1989), which offer a different entropy-based learning algorithm, are shown to rest on inappropriate assumptions.
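For reference, the Chow-Liu step being modified builds a maximum-weight spanning tree over pairwise empirical mutual information; the class-conditional variant described above would fit one such tree per class of hard-to-distinguish digits. A minimal sketch of the base construction (our own code, not the article's):

    import numpy as np
    from itertools import combinations

    def mutual_information(xi, xj):
        # Empirical mutual information between two discrete data columns.
        mi = 0.0
        for a in np.unique(xi):
            for b in np.unique(xj):
                p_ab = np.mean((xi == a) & (xj == b))
                if p_ab > 0:
                    mi += p_ab * np.log(p_ab / (np.mean(xi == a) * np.mean(xj == b)))
        return mi

    def chow_liu_tree(data):
        # Maximum-weight spanning tree over pairwise mutual information,
        # via Kruskal's algorithm with a union-find (Chow and Liu, 1968).
        n_vars = data.shape[1]
        edges = sorted(((mutual_information(data[:, i], data[:, j]), i, j)
                        for i, j in combinations(range(n_vars), 2)), reverse=True)
        parent = list(range(n_vars))
        def find(v):
            while parent[v] != v:
                parent[v] = parent[parent[v]]
                v = parent[v]
            return v
        tree = []
        for w, i, j in edges:
            ri, rj = find(i), find(j)
            if ri != rj:
                parent[ri] = rj
                tree.append((i, j, w))
        return tree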
Learning Gaussian Networks
Geiger, Dan, Heckerman, David
We describe algorithms for learning Bayesian networks from a combination of user knowledge and statistical data. The algorithms have two components: a scoring metric and a search procedure. The scoring metric takes a network structure, statistical data, and a user's prior knowledge, and returns a score proportional to the posterior probability of the network structure given the data. The search procedure generates networks for evaluation by the scoring metric. Previous work has concentrated on metrics for domains containing only discrete variables, under the assumption that data represents a multinomial sample. In this paper, we extend this work, developing scoring metrics for domains in which all variables are continuous, or in which the variables are a mixture of discrete and continuous, under the assumption that continuous data is sampled from a multivariate normal distribution. Our work extends traditional statistical approaches for identifying vanishing regression coefficients in that we identify two important assumptions, called event equivalence and parameter modularity, which, when combined, allow the construction of prior distributions for multivariate normal parameters from a single prior Bayesian network specified by a user.
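To make the family-wise decomposition concrete, here is a simplified scorer that decomposes a Gaussian DAG into one linear regression per node given its parents. Note the hedge: it uses a plain BIC penalty as a stand-in, whereas the paper derives a fully Bayesian metric from a prior network; the function and variable names are our own.

    import numpy as np

    def gaussian_dag_score(data, parents):
        # Decompose a Gaussian DAG into one linear regression per family
        # (node given its parents) and sum per-family log-likelihoods
        # with a BIC penalty. A stand-in for the Bayesian metric.
        n = data.shape[0]
        score = 0.0
        for node, pa in parents.items():
            y = data[:, node]
            X = np.column_stack([np.ones(n)] + [data[:, p] for p in pa])
            beta, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
            resid = y - X @ beta
            sigma2 = max(resid @ resid / n, 1e-12)
            loglik = -0.5 * n * (np.log(2 * np.pi * sigma2) + 1)
            n_params = X.shape[1] + 1  # coefficients plus a variance
            score += loglik - 0.5 * n_params * np.log(n)
        return score

    # Example DAG over three variables: 0 -> 1 -> 2.
    rng = np.random.default_rng(2)
    x0 = rng.normal(size=500)
    x1 = 0.8 * x0 + rng.normal(size=500)
    x2 = -0.5 * x1 + rng.normal(size=500)
    data = np.column_stack([x0, x1, x2])
    print(gaussian_dag_score(data, {0: [], 1: [0], 2: [1]}))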
A Sufficiently Fast Algorithm for Finding Close to Optimal Junction Trees
Becker, Ann, Geiger, Dan
An algorithm is developed for finding a close-to-optimal junction tree of a given graph G. The algorithm has worst-case complexity O(c^k n^a), where a and c are constants, n is the number of vertices, and k is the size of the largest clique in a junction tree of G in which this size is minimized. The algorithm guarantees that the logarithm of the size of the state space of the heaviest clique in the junction tree produced is within a constant factor of the optimal value. When k = O(log n), our algorithm yields a polynomial inference algorithm for Bayesian networks.
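For contrast, a common greedy baseline is sketched below: min-fill elimination, which at each step removes the vertex whose neighborhood needs the fewest fill-in edges and tracks the largest clique created. The paper's algorithm differs in that it carries the stated worst-case guarantee; this sketch carries none, and its names and graph encoding are our own.

    def min_fill_ordering(adj):
        # Greedy min-fill elimination over a dict-of-sets adjacency map;
        # returns the elimination order and the size of the largest
        # clique formed (an upper bound on junction-tree clique size
        # under this ordering).
        adj = {v: set(ns) for v, ns in adj.items()}
        order, max_clique = [], 1
        while adj:
            def fill(v):
                ns = adj[v]
                return sum(1 for a in ns for b in ns
                           if a < b and b not in adj[a])
            v = min(adj, key=fill)
            ns = adj[v]
            max_clique = max(max_clique, len(ns) + 1)
            for a in ns:          # connect v's neighbors into a clique
                adj[a] |= ns - {a}
            for a in ns:          # remove v from the graph
                adj[a].discard(v)
            order.append(v)
            del adj[v]
        return order, max_clique

    # 4-cycle: one fill-in edge is needed; largest clique has size 3.
    adj = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
    print(min_fill_ordering(adj))  # e.g. ([0, 1, 2, 3], 3)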
Graphical Models and Exponential Families
Geiger, Dan, Meek, Christopher
We provide a classification of graphical models according to their representation as subfamilies of exponential families. Undirected graphical models with no hidden variables are linear exponential families (LEFs); directed acyclic graphical models and chain graphs with no hidden variables, including Bayesian networks with several families of local distributions, are curved exponential families (CEFs); and graphical models with hidden variables are stratified exponential families (SEFs). An SEF is a finite union of CEFs satisfying a frontier condition. In addition, we illustrate how one can automatically generate independence and non-independence constraints on the distributions over the observable variables implied by a Bayesian network with hidden variables. The relevance of these results for model selection is examined.
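For orientation (standard definitions, not the paper's notation): a linear exponential family consists of densities

$$p(x \mid \theta) = \exp\{\theta^{\top} t(x) - \psi(\theta)\}, \qquad \theta \in \Theta \subseteq \mathbb{R}^k,$$

with the natural parameter $\theta$ ranging over an open convex set. A curved exponential family arises when $\theta$ is constrained to a smooth lower-dimensional manifold, $\theta = g(\eta)$ with $\eta \in E \subseteq \mathbb{R}^m$, $m < k$, and $g$ a smooth full-rank embedding. A stratified exponential family is, as stated above, a finite union of such CEFs satisfying a frontier condition, which is what accommodates the singularities introduced by hidden variables.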