Collaborating Authors: Heckerman, David


Removing Spurious Correlation from Neural Network Interpretations

arXiv.org Artificial Intelligence

Existing algorithms for identifying neurons responsible for undesired and harmful behaviors do not account for confounders such as the topic of the conversation. In this work, we show that confounders can create spurious correlations, and we propose a new causal mediation approach that controls for the impact of the topic. In experiments with two large language models, we study the localization hypothesis and show that, after adjusting for the effect of the conversation topic, toxicity becomes less localized.
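The abstract leaves the adjustment unspecified; as a loose illustration (hypothetical names and a stratification-style adjustment, not the paper's causal mediation estimator), one can compare a neuron's raw association with toxicity to its within-topic association:

```python
import numpy as np

def topic_adjusted_effect(activations, toxicity, topics):
    """Topic-adjusted association between a neuron's activation and
    toxicity: average the within-topic covariances so that variation
    explained by the topic itself is removed. A stratification-style
    sketch, not the paper's causal mediation estimator."""
    effects, weights = [], []
    for t in np.unique(topics):
        mask = topics == t
        if mask.sum() < 2:
            continue  # need at least two examples to estimate a covariance
        a, y = activations[mask], toxicity[mask]
        effects.append(np.cov(a, y)[0, 1])
        weights.append(mask.sum())
    return np.average(effects, weights=weights)

# Naive (potentially confounded) association, for comparison:
# np.cov(activations, toxicity)[0, 1]
```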


Multiply-Robust Causal Change Attribution

arXiv.org Machine Learning

Comparing two samples of data, we observe a change in the distribution of an outcome variable. In the presence of multiple explanatory variables, how much of the change can be explained by each possible cause? We develop a new estimation strategy that, given a causal model, combines regression and re-weighting methods to quantify the contribution of each causal mechanism. Our proposed methodology is multiply robust, meaning that it still recovers the target parameter under partial misspecification. We prove that our estimator is consistent and asymptotically normal. Moreover, it can be incorporated into existing frameworks for causal attribution, such as Shapley values, which will inherit the consistency and large-sample distribution properties. Our method demonstrates excellent performance in Monte Carlo simulations, and we show its usefulness in an empirical application.
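As a rough sketch of the regression-plus-re-weighting idea (a simplified two-mechanism case with assumed helper names, not the paper's multiply-robust estimator):

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

def attribute_change(X0, y0, X1, y1):
    """Attribute the change in E[Y] between two samples to (a) the shift
    in the covariate distribution P(X) and (b) the shift in the outcome
    mechanism P(Y|X). A simplified sketch of the regression and
    re-weighting components, not the paper's full estimator."""
    # Regression: outcome model fit on sample 0, evaluated on sample 1's X.
    outcome0 = LinearRegression().fit(X0, y0)
    counterfactual = outcome0.predict(X1).mean()  # old mechanism, new X

    # Re-weighting: density-ratio weights from a probabilistic classifier.
    X = np.vstack([X0, X1])
    s = np.r_[np.zeros(len(X0)), np.ones(len(X1))]
    clf = LogisticRegression().fit(X, s)
    p1 = clf.predict_proba(X0)[:, 1]
    w = (p1 / (1 - p1)) * (len(X0) / len(X1))  # approximates dP1(X)/dP0(X)
    counterfactual_rw = np.average(y0, weights=w)

    return dict(
        total=y1.mean() - y0.mean(),
        due_to_X=counterfactual - y0.mean(),          # mechanism fixed, X shifts
        due_to_mechanism=y1.mean() - counterfactual,  # X fixed, mechanism shifts
        due_to_X_reweighted=counterfactual_rw - y0.mean(),
    )
```

When both the outcome model and the weights are well specified, the two routes to `due_to_X` agree; multiple robustness means the target is still recovered when only some of the component models are correct.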



An Empirical Comparison of Three Inference Methods

arXiv.org Artificial Intelligence

Several years ago, before learning much about methods for reasoning with uncertainty, my colleagues and I began work on a large expert system, called Pathfinder, that assists community pathologists with the diagnosis of lymph node pathology. Because the Dempster-Shafer theory of belief was quite popular in our research group at the time, we developed an inference method for our expert system inspired by this theory. The program performed fairly well in the opinion of the expert pathologist who provided the knowledge for the system. In the months following the initial development of Pathfinder, several of us in the research group began exploring other methods for reasoning under uncertainty. We identified the Bayesian approach as a candidate for a new inference procedure. We realized that the measures of uncertainty we had assessed from the expert could be interpreted as probabilities, and we implemented a new inference method--a special case of Bayes' theorem. During this time, the expert was running cases through the program to test the system's diagnostic performance. One day, without telling him, we changed the inference procedure to the Bayesian approach. After running several cases with the new approach, the expert exclaimed, "What did you do to the program?


Likelihoods and Parameter Priors for Bayesian Networks

arXiv.org Machine Learning

We develop simple methods for constructing likelihoods and parameter priors for learning about the parameters and structure of a Bayesian network. In particular, we introduce several assumptions that permit the construction of likelihoods and parameter priors for a large number of Bayesian-network structures from a small set of assessments. The most notable assumption is that of likelihood equivalence, which says that data cannot help to discriminate network structures that encode the same assertions of conditional independence. We describe the constructions that follow from these assumptions, and also present a method for directly computing the marginal likelihood of a random sample with no missing observations. Also, we show how these assumptions lead to a general framework for characterizing parameter priors of multivariate distributions.
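The direct computation is not spelled out in the abstract; for a discrete network with complete data and independent Dirichlet parameter priors, the standard closed form per node is as follows (a sketch consistent with, but not copied from, the paper):

```python
import numpy as np
from scipy.special import gammaln

def log_marginal_likelihood(counts, alpha):
    """Closed-form log marginal likelihood of complete discrete data for
    one node of a Bayesian network under independent Dirichlet priors.
    counts[j, k] is N_ijk (parent configuration j, child state k);
    alpha[j, k] is the matching Dirichlet hyperparameter. The full
    network score is the sum of this quantity over nodes."""
    a_j = alpha.sum(axis=1)   # alpha_ij
    n_j = counts.sum(axis=1)  # N_ij
    return (np.sum(gammaln(a_j) - gammaln(a_j + n_j))
            + np.sum(gammaln(alpha + counts) - gammaln(alpha)))

# Toy usage: a binary node with one binary parent, uniform prior.
counts = np.array([[10, 2], [3, 9]])
alpha = np.ones((2, 2))
print(log_marginal_likelihood(counts, alpha))
```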


Learning Bayesian Networks: A Unification for Discrete and Gaussian Domains

arXiv.org Artificial Intelligence

At last year's conference, we presented approaches for learning Bayesian networks from a combination of prior knowledge and statistical data. These approaches were presented in two papers: one addressing domains containing only discrete variables (Heckerman et al., 1994), and the other addressing domains containing continuous variables related by an unknown multivariate-Gaussian distribution (Geiger and Heckerman, 1994). Unfortunately, these presentations were substantially different, making the parallels between the two methods difficult to appreciate. In this paper, we unify the two approaches. In particular, we abstract our previous assumptions of likelihood equivalence, parameter modularity, and parameter independence such that they are appropriate for discrete and Gaussian domains (as well as other domains). Using these assumptions, we derive a domain-independent Bayesian scoring metric. We then use this general metric in combination with well-known statistical facts about the Dirichlet and normal-Wishart distributions to derive our metrics for discrete and Gaussian domains. In addition, we provide simple proofs that these assumptions are consistent for both domains.
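As a sketch of how these assumptions combine (in generic notation, not necessarily the paper's): with complete data $D = \{x^1, \dots, x^N\}$ and structure hypothesis $S^h$, parameter independence lets the marginal likelihood factor by node, and parameter modularity lets each factor be reused across structures in which the node has the same parents.

```latex
% Under parameter independence, the marginal likelihood of complete
% data factors into one term per node, each depending only on that
% node's parameters \theta_i and its parents pa_i:
\[
  p(D \mid S^h)
  \;=\;
  \prod_{i=1}^{n} \int p(\theta_i \mid S^h)
  \prod_{l=1}^{N} p\!\left(x_i^l \,\middle|\, \mathrm{pa}_i^l, \theta_i\right)
  d\theta_i .
\]
% Likelihood equivalence then constrains the priors p(\theta_i | S^h)
% so that structures encoding the same independencies receive equal
% scores; together with the other assumptions this leads to Dirichlet
% priors in discrete domains and normal-Wishart priors in Gaussian
% domains, as the abstract notes.
```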


Parameter Priors for Directed Acyclic Graphical Models and the Characterization of Several Probability Distributions

arXiv.org Machine Learning

We develop simple methods for constructing parameter priors for model choice among Directed Acyclic Graphical (DAG) models. In particular, we introduce several assumptions that permit the construction of parameter priors for a large number of DAG models from a small set of assessments. We then present a method for directly computing the marginal likelihood of every DAG model given a random sample with no missing observations. We apply this methodology to Gaussian DAG models which consist of a recursive set of linear regression models. We show that the only parameter prior for complete Gaussian DAG models that satisfies our assumptions is the normal-Wishart distribution. Our analysis is based on the following new characterization of the Wishart distribution: let $W$ be an $n \times n$, $n \ge 3$, positive-definite symmetric matrix of random variables and $f(W)$ be a pdf of $W$. Then, $f(W)$ is a Wishart distribution if and only if $W_{11} - W_{12} W_{22}^{-1} W'_{12}$ is independent of $\{W_{12},W_{22}\}$ for every block partitioning $W_{11},W_{12}, W'_{12}, W_{22}$ of $W$. Similar characterizations of the normal and normal-Wishart distributions are provided as well.
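The characterization invites a quick numerical sanity check. The sketch below (an illustration, not from the paper) samples Wishart matrices with scipy and confirms that the sample correlation between the Schur complement and entries of $\{W_{12}, W_{22}\}$ is near zero; zero correlation is only a necessary consequence of independence, so this illustrates rather than verifies the theorem.

```python
import numpy as np
from scipy.stats import wishart

# For Wishart-distributed W, the Schur complement
# W11 - W12 W22^{-1} W12' should be independent of {W12, W22}.
rng = np.random.default_rng(0)
n, df, m = 3, 10, 50_000
W = wishart(df=df, scale=np.eye(n)).rvs(size=m, random_state=rng)

W11 = W[:, 0, 0]       # 1x1 block, shape (m,)
W12 = W[:, 0, 1:]      # shape (m, n-1)
W22 = W[:, 1:, 1:]     # shape (m, n-1, n-1)
schur = W11 - np.einsum('mi,mij,mj->m', W12, np.linalg.inv(W22), W12)

for name, other in [('W12[0]', W12[:, 0]), ('W22[0,0]', W22[:, 0, 0])]:
    r = np.corrcoef(schur, other)[0, 1]
    print(f'corr(Schur, {name}) = {r:+.4f}')  # both near zero
```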


Embedded Bayesian Network Classifiers

arXiv.org Artificial Intelligence

Low-dimensional probability models for local distribution functions in a Bayesian network include decision trees, decision graphs, and causal independence models. We describe a new probability model for discrete Bayesian networks, which we call an embedded Bayesian network classifier or EBNC. The model for a node $Y$ given parents $\bf X$ is obtained from a (usually different) Bayesian network for $Y$ and $\bf X$ in which $\bf X$ need not be the parents of $Y$. We show that an EBNC is a special case of a softmax polynomial regression model. Also, we show how to identify a non-redundant set of parameters for an EBNC, and describe an asymptotic approximation for learning the structure of Bayesian networks that contain EBNCs. Unlike for the decision tree, decision graph, and causal independence models, we are unaware of a semantic justification for the use of EBNCs. Experiments are needed to determine whether the models presented in this paper are useful in practice.
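For readers unfamiliar with the model class the abstract reduces EBNCs to, here is a minimal, generic softmax polynomial regression in scikit-learn (the degree and data are illustrative assumptions, not the paper's construction):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Softmax polynomial regression for p(Y | X): polynomial terms in the
# encoded parents X feed a multinomial logistic (softmax) output.
model = make_pipeline(
    PolynomialFeatures(degree=2, include_bias=False),
    LogisticRegression(),
)

X = np.random.default_rng(1).integers(0, 2, size=(200, 3))  # discrete parents
y = (X.sum(axis=1) > 1).astype(int)                         # toy target
model.fit(X, y)
print(model.predict_proba(X[:3]))  # each row sums to one: a softmax over Y
```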


Parameter Priors for Directed Acyclic Graphical Models and the Characterization of Several Probability Distributions

arXiv.org Machine Learning

We show that the only parameter prior for complete Gaussian DAG models that satisfies global parameter independence, complete model equivalence, and some weak regularity assumptions is the normal-Wishart distribution. Our analysis is based on the following new characterization of the Wishart distribution: let $W$ be an $n \times n$, $n \ge 3$, positive-definite symmetric matrix of random variables and $f(W)$ be a pdf of $W$. Then, $f(W)$ is a Wishart distribution if and only if $W_{11} - W_{12} W_{22}^{-1} W'_{12}$ is independent of $\{W_{12}, W_{22}\}$ for every block partitioning $W_{11}, W_{12}, W'_{12}, W_{22}$ of $W$. Similar characterizations of the normal and normal-Wishart distributions are provided as well. We also show how to construct a prior for every DAG model over $X$ from the prior of a single regression model.


Fast Learning from Sparse Data

arXiv.org Machine Learning

We describe two techniques that significantly improve the running time of several standard machine-learning algorithms when data is sparse. The first technique is an algorithm that efficiently extracts one-way and two-way counts--either real or expected--from discrete data. Extracting such counts is a fundamental step in learning algorithms for constructing a variety of models including decision trees, decision graphs, Bayesian networks, and naive-Bayes clustering models. The second technique is an algorithm that efficiently performs the E-step of the EM algorithm (i.e., inference) when applied to a naive-Bayes clustering model. Using real-world data sets, we demonstrate a dramatic decrease in running time for algorithms that incorporate these techniques.
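The abstract does not give the counting algorithm itself; the sketch below illustrates the underlying idea under assumed data structures: iterate only over non-default entries, so one-way and two-way counts cost time proportional to the number of nonzeros, with counts involving default values recoverable afterwards by subtraction or inclusion-exclusion.

```python
from collections import Counter
from itertools import combinations

def sparse_counts(records):
    """One-way and two-way counts from sparse discrete data, touching
    only non-default entries. `records` is an iterable of dicts mapping
    variable -> value for the entries that differ from the default;
    default-valued counts follow by subtraction from the total n, so
    they are never enumerated. A sketch of the idea, not the paper's
    algorithm."""
    n = 0
    one_way, two_way = Counter(), Counter()
    for rec in records:
        n += 1
        items = sorted(rec.items())
        one_way.update(items)
        two_way.update(combinations(items, 2))
    return n, one_way, two_way

# Toy usage: three records; the third is entirely default-valued.
n, ones, twos = sparse_counts([{'a': 1}, {'a': 1, 'b': 2}, {}])
print(ones[('a', 1)])                   # 2 non-default occurrences of a=1
print(twos[(('a', 1), ('b', 2))])       # 1 co-occurrence
print(n - ones[('a', 1)])               # count of a at its default value
```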