nested markov model
Limitation of intervention not changing parent set: There are many settings in the empirical sciences where
We would like to thank the reviewers for their comments and constructive feedback. Below, we address the main issues raised and clarify some misunderstandings. Also, the work of Yang et al. (2018) characterizes soft interventions in systems without latent variables. Mooij et al. (2013) discussed interventions of this nature in the context of equilibrium in cyclic causal models. Usage of MAGs: The reviewer's observation only holds for hard interventions.
On the physics of nested Markov models: a generalized probabilistic theory perspective
Determining potential probability distributions with a given causal graph is vital for causality studies. To bypass the difficulty in characterizing latent variables in a Bayesian network, the nested Markov model provides an elegant algebraic approach by listing exactly all the equality constraints on the observed variables. However, this algebraically motivated causal model comprises distributions outside Bayesian networks, and its physical interpretation remains vague. In this work, we inspect the nested Markov model through the lens of generalized probabilistic theory, an axiomatic framework to describe general physical theories. We prove that all the equality constraints defining the nested Markov model hold theory-independently. Yet, we show this model generally contains distributions not implementable even within such relaxed physical theories, which are subject only to relativity principles and mild probabilistic rules. To interpret the origin of such a gap, we establish a new causal model that defines valid distributions as projected from a high-dimensional Bell-type causal structure. The new model unveils inequality constraints induced by relativity principles, or equivalently high-dimensional conditional independences, which are absent in the nested Markov model. Nevertheless, we also notice that the restrictions on states and measurements introduced by the generalized probabilistic theory framework can pose additional inequality constraints beyond the new causal model. As a by-product, we discover a new causal structure exhibiting strict gaps between the distribution sets of a Bayesian network, generalized probabilistic theories, and the nested Markov model. We anticipate that our results will inform further exploration of the unification of algebraic and physical perspectives of causality.
Margins of discrete Bayesian networks
Bayesian network models with latent variables are widely used in statistics and machine learning. In this paper we provide a complete algebraic characterization of Bayesian network models with latent variables when the observed variables are discrete and no assumption is made about the state-space of the latent variables. We show that it is algebraically equivalent to the so-called nested Markov model, meaning that the two are the same up to inequality constraints on the joint probabilities. In particular these two models have the same dimension. The nested Markov model is therefore the best possible description of the latent variable model that avoids consideration of inequalities, which are extremely complicated in general. A consequence of this is that the constraint-finding algorithm of Tian and Pearl (UAI 2002, pp. 519-527) is complete for finding equality constraints. Latent variable models suffer from difficulties of unidentifiable parameters and non-regular asymptotics; in contrast the nested Markov model is fully identifiable, represents a curved exponential family of known dimension, and can easily be fitted using an explicit parameterization.
Sparse Nested Markov models with Log-linear Parameters
Shpitser, Ilya, Evans, Robin J., Richardson, Thomas S., Robins, James M.
Hidden variables are ubiquitous in practical data analysis, and therefore modeling marginal densities and doing inference with the resulting models is an important problem in statistics, machine learning, and causal inference. Recently, a new type of graphical model, called the nested Markov model, was developed which captures equality constraints found in marginals of directed acyclic graph (DAG) models. Some of these constraints, such as the so-called 'Verma constraint', strictly generalize conditional independence. To make modeling and inference with nested Markov models practical, it is necessary to limit the number of parameters in the model, while still correctly capturing the constraints in the marginal of a DAG model. Placing such limits is similar in spirit to sparsity methods for undirected graphical models and regression models. In this paper, we give a log-linear parameterization which allows sparse modeling with nested Markov models. We illustrate the advantages of this parameterization with a simulation study.
Parameter and Structure Learning in Nested Markov Models
Shpitser, Ilya, Richardson, Thomas S., Robins, James M., Evans, Robin
The constraints arising from DAG models with latent variables can be naturally represented by means of acyclic directed mixed graphs (ADMGs). Such graphs contain directed and bidirected arrows, and contain no directed cycles. DAGs with latent variables imply independence constraints in the distribution resulting from a 'fixing' operation, in which a joint distribution is divided by a conditional. This operation generalizes marginalizing and conditioning. Some of these constraints correspond to identifiable 'dormant' independence constraints, with the well-known 'Verma constraint' as one example. Recently, models defined by a set of the constraints arising after fixing from a DAG with latents were characterized via a recursive factorization and a nested Markov property. In addition, a parameterization was given in the discrete case. In this paper we use this parameterization to describe a parameter fitting algorithm, and a search-and-score structure learning algorithm for these nested Markov models. We apply our algorithms to a variety of datasets.
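The 'fixing' operation and the Verma constraint mentioned above can be checked numerically on a small example. The sketch below is not taken from any of the listed papers. It assumes a hypothetical DAG A -> B -> C -> D with a latent U pointing into B and D, binary variables, and randomly drawn conditional tables; all names (random_cpt, p_b, kernel, and so on) are illustrative. Fixing C means dividing the observed joint by p(c | a, b); in the resulting kernel, D no longer depends on A once B is summed out.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_cpt(*shape):
    """Random strictly positive table whose last axis sums to 1 (illustrative helper)."""
    t = rng.random(shape) + 0.1
    return t / t.sum(axis=-1, keepdims=True)

# Assumed DAG: A -> B -> C -> D, with latent U -> B and U -> D.
p_u = random_cpt(2)        # p(u)
p_a = random_cpt(2)        # p(a)
p_b = random_cpt(2, 2, 2)  # p(b | a, u), indexed [a, u, b]
p_c = random_cpt(2, 2)     # p(c | b),    indexed [b, c]
p_d = random_cpt(2, 2, 2)  # p(d | c, u), indexed [c, u, d]

# Observed margin p(a, b, c, d): the latent U is summed out.
joint = np.einsum("u,a,aub,bc,cud->abcd", p_u, p_a, p_b, p_c, p_d)

# Fix C: divide the joint by the conditional p(c | a, b).
p_abc = joint.sum(axis=3)
p_c_given_ab = p_abc / p_abc.sum(axis=2, keepdims=True)
kernel = joint / p_c_given_ab[..., None]  # q(a, b, d | c), indexed [a, b, c, d]

# Sum out B, then condition on A inside the kernel: q(d | a, c).
q_adc = kernel.sum(axis=1)                                # indexed [a, c, d]
q_d_given_ac = q_adc / q_adc.sum(axis=2, keepdims=True)

# Verma constraint: q(d | a, c) takes the same value for a = 0 and a = 1.
verma_holds = np.allclose(q_d_given_ac[0], q_d_given_ac[1])
print("Verma constraint holds:", verma_holds)
```

The quantity q(d | a, c) equals the sum over b of p(d | a, b, c) p(b | a), so the check is a statement about the observed margin alone, even though A and D are not conditionally independent given C in the ordinary sense for this graph.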